Justin Uberti

5.4K posts

@juberti

Head of Realtime AI @OpenAI. Created WebRTC. Past: CTO @ultravox_dot_ai, Distinguished Engineer @google (Stadia, Meet/Duo), AIM. Amateur mathematician/musician.

Seattle, WA · Joined February 2007
120 Following · 14K Followers
Pinned Tweet
Justin Uberti @juberti
We’ve integrated the ChatGPT Voice “orb” into the chat view in our latest iOS release, and you can toggle between integrated/fullscreen view. I find chatting with the orb feels more engaging, like there’s an actual thing you’re talking to. Thoughts?
[2 images]
Replies: 67 · Reposts: 9 · Likes: 346 · Views: 36.7K
Justin Uberti @juberti
@jezell @stevendcoffey The need makes sense, just wondering if you can approximate it by continuing to append results to the same call.
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 24
Jesse Ezell @jezell
@juberti @stevendcoffey Imagine for example a tool call which runs a docker build. It’s a very long process. You actually want it to return results incrementally, not the whole chunk at the end for both context length issues and general agent awareness issues.
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 31
Jesse Ezell @jezell
@stevendcoffey @juberti I think there's a really big gap in the Responses API right now in that there is zero support for streaming tool calls. Now that there are websockets and things are moving more and more toward realtime, with voice models on the way that change the interruptions model, the whole "tool calls need to be request/response interactions" thing is gonna get old fast. Sure, you can make some tools to do things like spin up background terminals and grep over their logs, but real systems are full of realtime events and signals, and the current APIs just don't map to that world. I'm looking forward to whatever future unifies the Realtime API and the Websocket Responses API into something that can handle those workloads.
Replies: 1 · Reposts: 0 · Likes: 2 · Views: 278
Justin Uberti retweeted
Sebastien Bubeck @SebastienBubeck
GPT-5.4
[image]
Replies: 43 · Reposts: 52 · Likes: 806 · Views: 93.2K
Santiago Afonso @SantiagoAfonso
@athyuttamre @juberti My assumption here is that you cannot bring 5.2 xhigh (2-6 minutes of thinking) or pro level intelligence (10+ minutes thinking for the easiest answers) to realtime in the next 36 months. I hope I'm wrong!
Replies: 2 · Reposts: 0 · Likes: 1 · Views: 40
Justin Uberti @juberti
@wayne_culbreth Realtime API models are separate from the ChatGPT Voice models since 3p customers have somewhat different needs
Replies: 1 · Reposts: 0 · Likes: 3 · Views: 671
Justin Uberti retweeted
Peter Bakkum @pbbakkum
gpt-realtime-1.5 is the best native audio model on the Scale AudioMultiChallenge benchmark -- this is a significant jump in capability by this measure. There are models that outperform it but they are reasoning models without native audio output.
[image]
Replies: 13 · Reposts: 15 · Likes: 179 · Views: 24.4K
Justin Uberti @juberti
@adamac hmm. It should be instantaneous. That said, we are working on a number of optimizations to improve this flow (including the smoothness of the UX)
Replies: 0 · Reposts: 0 · Likes: 2 · Views: 33
Adam MacBeth @adamac
@juberti In a new chat or the same chat. Doesn’t help that the animation is janky as that captures my attention.
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 30
Adam MacBeth @adamac
@juberti The time it takes to connect (spinner) is a hindrance to using voice mode imo. Seems like it’s at least 3 seconds every time even when toggling back and forth. Why isn’t this instantaneous?
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 41
Fabian @fabrohl
@LearnAI_MJ @juberti You drag the circle conversation bubble up into the main view.
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 75
Justin Uberti @juberti
interesting! a bit uncanny but definitely shows that some amount of singing with gpt-realtime-1.5 is possible...
Quoted tweet from stephen 🌿 @stevelizcano:
@juberti been experimenting with things like this and realtime-1.5 it's pretty fun and makes it engaging
Replies: 1 · Reposts: 0 · Likes: 10 · Views: 2.2K
Justin Uberti @juberti
Yeah. This is the number one AVM complaint I hear these days. The amazing progress from reasoning models has made the text/audio intelligence delta much more obvious.
Quoted tweet from CommonSenseOnMars @CommonSenseMars:
@juberti Agree, nice option. Just hoping for a legit voice model intelligence update eventually. Tbf OpenAI has been ahead of all the other AI lab voice models for over 1.5yrs now which is impressive
Replies: 1 · Reposts: 0 · Likes: 29 · Views: 2.6K
Justin Uberti @juberti
You can easily make the model whisper via prompting. Singing is another story, as there are nontechnical restrictions there. You can experiment at platform.openai.com/audio/realtime
Quoted tweet from Sava @savamusics:
@juberti Hi Justin, I tested it early in the morning. Will the final model be more expressive in terms of its tone variation and voice volume? It doesn't slow down the speed of its speech as much as I would like and doesn't whisper. Also, it doesn't sing when asked to.
Replies: 1 · Reposts: 0 · Likes: 10 · Views: 1.5K
Justin Uberti @juberti
APIs remain the same, so this is a seamless upgrade for existing gpt-realtime users.
Replies: 1 · Reposts: 0 · Likes: 6 · Views: 1.5K
Dave Zatz @davezatz
@juberti 15 mins today, no issue. Not that you asked, the vocalized and non-vocalized pauses are off-putting in some way I can't put my finger on. :)
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 51
Dave Zatz @davezatz
Slightly more convenient than hitting the action button to speak to Grok while commuting, as I do now. (ChatGPT and Gemini interrupt themselves over car speakers, so Grok is currently the only option.)
Quoted tweet from Mark Gurman @markgurman:
NEW: Apple is preparing to allow voice-controlled artificial intelligence apps from other companies in CarPlay, a move that will let users query AI chatbots through its vehicle interface for the first time. bloomberg.com/news/articles/…
Replies: 1 · Reposts: 0 · Likes: 0 · Views: 1.1K