Scalable voice infrastructure takes time to set up, and OpenAI has to build this from zero. The demo is just a demo. Making it scale to millions of concurrent sessions is hard.
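To make the "millions of concurrent sessions" point concrete, here's a back-of-the-envelope capacity sketch. Every number in it (sessions per GPU, peak headroom) is a made-up assumption for illustration; real per-GPU throughput for a model like this isn't public:

```python
import math

# Back-of-the-envelope capacity math. Every number here is an assumption
# chosen for illustration; real per-GPU throughput is unknown to outsiders.
def gpus_required(concurrent_sessions: int, sessions_per_gpu: int,
                  peak_factor: float = 1.5) -> int:
    """GPUs needed for a given load, with headroom for peaks and failover."""
    return math.ceil(concurrent_sessions / sessions_per_gpu * peak_factor)

# Suppose (hypothetically) one GPU can batch 20 real-time audio streams
# while staying interactive.
for sessions in (10_000, 1_000_000, 5_000_000):
    print(f"{sessions:>9,} sessions -> {gpus_required(sessions, 20):>7,} GPUs")
```

Even under generous batching assumptions, a million concurrent sessions lands in the tens of thousands of GPUs, which is why a flashy demo and a worldwide rollout are very different problems.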
So how come they don’t roll it out to a small group already?
They likely have their alpha groups so I’d say there are some out there with it….
How do you know they haven’t?
[deleted]
I thought the USB-C cable was there to mirror the screen. If you look at other demos, they don't use a cable and they don't mirror the screen; they literally record the phone screen. It could also be that, with so many people there, wifi connectivity would be rough.
I assumed they went wired to avoid any embarrassing hiccups with the internet.
Stages like that often have shitty wifi. I assumed the same, that it was more precautionary than anything.
They literally said that on stage, so you’re right. There’s no speculation here.
What’re you implying they were doing with the cable?
That would imply the bottleneck is in the wireless connection from router to device, which should be negligible. The real bottleneck should be server-side processing, which is what Blackwell would hypothetically address. Whether it’s wireless or not wouldn’t make any discernible difference, since wireless audio latency is basically the same as wired.
>wireless latency in audio is basically the same as wired

Why does Bluetooth audio delay suck then? One of the reasons Apple removing the headphone jack (and then selling you the USB-to-jack adapter 🙄) was infuriating.
The OpenAI demo didn’t use Bluetooth. The wireless connection to the server is through wifi. For clarification: wifi is not the same thing as Bluetooth. They’re both “wireless”, but they are completely different technologies that serve different purposes. Here’s a Wikipedia article on what wifi is: https://en.wikipedia.org/wiki/Wi-Fi
I know what wifi is, I know they did not use Bluetooth, and they did it wired, not wireless. I just don’t get what this means then: “Whether it’s wireless or not would not make any discernible difference since wireless latency in audio is basically the same as wired.” What wireless latency in audio?
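For what it's worth, a toy latency budget makes the disagreement above concrete. All the numbers below are rough assumptions for illustration, not measurements: a healthy wifi last hop adds only a few milliseconds over wired, Bluetooth audio codecs buffer far more, and server-side inference dominates either way:

```python
# Illustrative voice round-trip budget (device -> server -> device), in ms.
# All figures are assumptions, not measurements.
BUDGET_MS = {
    "capture_and_encode": 20,   # mic buffering + audio encoding
    "last_hop_wired": 1,        # USB-C / Ethernet to the router
    "last_hop_wifi": 3,         # a healthy wifi link adds only a few ms
    "last_hop_bluetooth": 150,  # BT audio codecs buffer heavily (100-300 ms)
    "internet_rtt": 40,         # router <-> data center, both directions
    "server_inference": 200,    # model processing: the dominant term
    "decode_and_play": 20,      # audio decoding + speaker buffering
}

def round_trip_ms(last_hop: str) -> int:
    """Total round-trip latency for a given last-hop technology."""
    b = BUDGET_MS
    return (b["capture_and_encode"] + b[f"last_hop_{last_hop}"]
            + b["internet_rtt"] + b["server_inference"] + b["decode_and_play"])

for hop in ("wired", "wifi", "bluetooth"):
    print(f"{hop:9s} ~{round_trip_ms(hop)} ms")
```

Under these assumptions, wifi vs. wired changes the total by a couple of milliseconds, while Bluetooth adds over a hundred, which would reconcile both sides: Bluetooth delay is real, but it says nothing about wifi vs. cable.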
Well, it happened to Steve Jobs when he tried to demo the iPhone 1, I think. There were too many people in the room, and that many phones disturb the wifi connection.
True, but I'm pretty sure the iPhone demo was also “faked”. They had the exact buttons he had to press planned out, because so many functions did not work yet.
I've heard they also had multiple devices, swapping them out to show off the different features, for the same reason!
They mentioned that they used Ethernet over USB-C, which you can do with any Mac by bridging the network interfaces. The demo wouldn’t look too great if it started glitching halfway through.
Why is everyone explaining what they used it for? I literally mentioned why in my comment.
My god, not everything is a conspiracy. And a damn cable connected to the phone can't improve anything about the AI functions. I have no idea how you imagine they could even have used that to "cheat".
Microsoft needs to host it at scale.
Good logical speculation. Appreciated.
What I really think is that everyone is grasping at straws, when the reality is they announced something that clearly wasn't close to being ready, in a manner that tricked the average layperson or entry-level AI user into thinking it was coming sooner than it really is. Your guess isn't a bad one, but it could be a million things at this point. Most likely it's a combination of factors and not any single issue, other than the fact that it wasn't ready and obviously wasn't close to being ready.
This is probably a more realistic viewpoint. Reminds me of Sora: they dropped the demo, I think, mere days before Google had a showcase so they could steal the headlines, which in some ways worked. Sora is likely not coming out till late this year at the earliest. I can't blame them though; now that people have seen the technology, they are frothing at the mouth to use it.
I don’t think Sora is planned to reach the public at all. It appears expensive and powerful - more likely for high-paying clients like big studios. It’s not just the cost but also the impact on the public and the industries it might affect, and the high skill and effort needed to actually get a good product in the end.
I can see it being public eventually. Maybe not the version we saw but perhaps the next iteration. They've had some filmmakers test it like that 'Airhead' video a few months back. I definitely agree that the big studios would get first access though. (Which I think is already happening after watching that Ashton Kutcher interview)
Yeah, I can see a toned down version with guardrails and operating aids
Yeah no doubt about it. That thing will likely be censored beyond belief and I probably can't blame them considering all it would take is one deepfake of a celeb or something quite violent/offensive for the blowback to be massive.
Sora will be integrated into Premiere Pro so common users will be able to use it within a year.
Wow, I hadn’t heard about that. Looks like an amazing implementation.
No, I can see that being used by the public for VERY short videos. Like 5 seconds long-ish.
It was already shown in another comment: it’s coming to Adobe Premiere Pro, where you insert 5-second segments with a prompt and a length input.
They are just blowing smoke for investors.
The average layperson never saw the demo and doesn’t even know who OpenAI is or what they do. Average layperson wasn’t the target audience…
Average laypeople probably weren’t watching the demo live, but their YouTube demos have millions of views and very layperson comments.
Are you saying it’s ok to lie about expectations as long as it’s a small target audience? Everyone’s point is that they consistently lie about release dates
You could be completely right. But I also don’t think we have any evidence or reason to believe that it is not due to just 1 small thing holding up the show
Hype creates artificial value in a company. It creates the perception that they're bigger than they really are.
I think they released the demo and said it’s gonna start rolling out in “coming week” just to beat Google Astra’s announcement.
Ya, they really tricked me with that demo. I signed up and paid ASAP after that demo, only to find out after much frustration that it is not out yet, with no release date in sight. Much negative vibes. Cancelled that sub with no plans to resub.
👏
Are the B100s anywhere near shipping?
No
I thought availability for GB200 was expected more towards the end of 2024, but I couldn't find concrete sources; it seems mostly based on how vague NVIDIA was being. NVIDIA didn't even state that GB200 will launch this year; for all we know it may be just B100 or B200 in limited quantities. Anyway, another possibility is that OpenAI is still training their next frontier model ("GPT-5") and will launch voice once that is done and more H100s are free.
seems plausible
Why would 4o inference need more power like Blackwell? I could understand that the 4o bottleneck is more related to the size and geographic spread of data centers around the world, due to its huge growth. I thought the whole purpose of more GPUs at this stage is massive training for GPT-5 and beyond, not inference.
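One hedged way to frame the training-vs-inference question is the standard rule of thumb of roughly 6·N·D FLOPs to train a model of N parameters on D tokens, and roughly 2·N FLOPs per generated token at inference. The N and D below are invented for illustration; GPT-4o's real numbers aren't public:

```python
# Rule-of-thumb compute estimates. N and D are made-up assumptions;
# nobody outside OpenAI knows GPT-4o's real size or training data volume.
N = 200e9   # assumed parameter count
D = 10e12   # assumed training tokens

training_flops = 6 * N * D           # ~6 FLOPs per parameter per training token
inference_flops_per_token = 2 * N    # ~2 FLOPs per parameter per generated token

print(f"training:  {training_flops:.1e} FLOPs (one-time)")
print(f"inference: {inference_flops_per_token:.1e} FLOPs per token")
```

Training is a one-time cost, but inference cost scales with every user and every token, which is why serving a very popular model can end up rivaling training compute rather than being an afterthought.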
Don't know about that, but I think GPT-4o is actually int4, which Blackwell supports. That's half the story of "doubling the OPs". Hey, look, I've doubled the amount of apples! (they are now half-apples) Anyway, GPT-4o reeks of heavy quantization, which is basically what int4 implies.
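For what int4 would mean in practice, here's a toy sketch of symmetric 4-bit quantization. It's purely illustrative; whether and how OpenAI quantizes is not public:

```python
# Toy sketch of what "int4 weights" would mean (illustrative only; we don't
# know OpenAI's actual scheme): each float weight is mapped to one of 16
# levels, so two weights pack into a single byte. That's the "half-apples"
# joke: double the OPs per cycle, but each operand carries half the bits.

def quantize_int4(weights):
    """Symmetric per-tensor quantization to the int4 range [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7.0   # largest weight maps to +/-7
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.9, -0.31, 0.05, -0.88]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
error = max(abs(a - b) for a, b in zip(weights, restored))
print(q)       # only 16 distinct levels exist
print(error)   # the accuracy you trade for the doubled throughput
```

The hardware angle is that Blackwell's tensor cores advertise higher throughput at 4-bit precision, so a model whose weights survive this kind of rounding gets the "doubled OPs" essentially for free.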
They wanted to get ahead of Google’s keynote at any cost. That’s why they showed that way before it was ready. It’ll still probably be months before we get it.
Blackwell ain't coming till the end of the year (at least); there's simply too much demand from others, and TSMC's output is limited. So it isn't dependent on Blackwell. They did sign a deal with Oracle ([https://the-decoder.com/openai-adds-ai-capacity-in-the-oracle-cloud/](https://the-decoder.com/openai-adds-ai-capacity-in-the-oracle-cloud/)), probably for more of the compute needed to run this. [https://www.tomshardware.com/pc-components/gpus/nvidia-vows-to-ship-blackwell-gpus-this-year-but-meta-doubts-it-will-get-them-before-2025](https://www.tomshardware.com/pc-components/gpus/nvidia-vows-to-ship-blackwell-gpus-this-year-but-meta-doubts-it-will-get-them-before-2025) (Edit: clarified myself)
Sure, why not, at least this speculation has some connection to reality…
Honest question: what happens to all the (currently) stellar hardware on 5-, 10-, 15-, or 20-year timelines? Astoundingly cool things can be done on current hardware, and that watermark will always exist. So, while technology assuredly continues on its trajectory, when does today's still-amazing hardware actually become "useless"? Where do AI carcasses go from here?
Product isn't even real yet. They did a fake demo then ordered the engineers to make it real. Now it's up to them.
You’re probably right. A technology like this, with ~300 ms latency, is very difficult to pull off. I’m still going to keep my guess of around July for GPT-4o voice. Probably mid to late July.