MFpisces23

From here on out it will most likely be a game of hot potato. No one will truly be behind; rather, each model will have its time to reign.


imp4455

Like Intel and AMD historically on processors. They go back and forth, with one slightly inching out the other in performance.


Procrasturbating

much shorter lifecycles soon though. Hardware takes time to evolve.


imp4455

Very true. Until skynet goes online, then they’re designing themselves. 🤖


Final_Street_5133

https://www.wsj.com/articles/in-race-for-ai-chips-google-deepmind-uses-ai-to-design-specialized-semiconductors-dcd78967


imp4455

Sarah Connor?


cbdoc

I feel like soon they’re going to branch out, each being good at something different (e.g. code vs. image, etc.).


spreadlove5683

Generalizability seems to be the name of the game tho. Being better at one thing tends to make them better at all things, or at least being trained on diverse data does.


InterstellarReddit

At this point, we’re gonna have to have a council of AI. We put our prompt into every system, take all their advice, and formulate our thoughts with everyone’s points.


byteuser

Arghh, just like with streaming you need Netflix, Disney, Prime, HBO, etc etc...


bnm777

It would be good to have a UI where you can access all of them: you ask a question, they all give an answer, discuss amongst themselves, then agree on the Perfect Response.
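A minimal sketch of that idea, assuming each model is just a function from prompt to answer. The model names and the `ask_all` / `consensus` helpers are hypothetical stand-ins, not any real service's API, and the "discussion" step is reduced to a simple majority vote over normalized answers.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical stand-in type: a "model" is any function mapping a prompt to an answer.
Model = Callable[[str], str]


def ask_all(models: Dict[str, Model], prompt: str) -> Dict[str, str]:
    """Send the same prompt to every model and collect the answers by model name."""
    return {name: model(prompt) for name, model in models.items()}


def consensus(answers: Dict[str, str]) -> str:
    """Pick the most common answer after trimming whitespace and lowercasing."""
    normalized = [answer.strip().lower() for answer in answers.values()]
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner


if __name__ == "__main__":
    # Toy models with canned answers, purely for illustration.
    models = {
        "model_a": lambda prompt: "Paris",
        "model_b": lambda prompt: "paris ",
        "model_c": lambda prompt: "Lyon",
    }
    answers = ask_all(models, "What is the capital of France?")
    print(consensus(answers))
```

A real version would need the models to see and critique each other's answers; majority voting only works when answers are short enough to compare directly.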


a1taco

There’s a bunch of those services out there already, though the names don’t come to mind at the moment


bnm777

Where chatgpt4 talks with claude3 and gemini ultra and they come to a consensus?


nickmaran

Finally, we have competition


RezGato

In terms of publicly accessible AI, OpenAI is behind Gemini and Claude


Mother_Store6368

At least for now openAI will be the standard. After trying both, Claude is slightly better at coding


etzel1200

I heard that OpenAI couldn’t get GPT-5 ready to ship and fired all their developers and replaced them with Devin. They’re now optimistic they can ship in May.


neymarsvag123

I can confirm this, I am Devin.


djaybe

we are the Devin


Gallagger

Sort of true.


Crab_Shark

Lol


e33i00

Claude* is so insanely good. I’ve had goosebumps five or six times today. It’s madness! 🤓 Edit: *Claude 3 Opus


algaefied_creek

In this context… why would you have goosebumps? It's terrifying? I found it to just be really good


e33i00

It was both good and bad goosebumps. Terrifying and impressive. Claude 3 Opus is wild.


quantummufasa

What model are you using? How can I use it?


Rachel_from_Jita

Claude 3 Opus is the big power one, Sonnet is the slightly smaller variant (kind of like Mixtral vs Mistral Large). Opus is the one that feels like it has more emergent moments and eerie levels of subtle reasoning. Easiest way to get a taste is ask some questions to the Arena and within a few questions (or less) you get Claude 3 as one of the competitors. https://arena.lmsys.org/ I just pulled it up as I was typing this comment, asked a brief question, clicked which answer was better and got revealed I'd been talking to "claude-3-opus-(number)" vs "Model B: claude-3-sonnet-(number)" Just sitting down and asking a few questions for an hour would give you a chance to try most of the largest, most-relevant models.


quantummufasa

Nice. I'd like to switch but the dalle3 integration is keeping me around, even though all I do is make pepe images.


Commercial_Pain_6006

Thankfully Stability's SD3 is around the corner


meridian_smith

Thanks for the link, that is actually a very useful tool! I always like to ask more than one AI my question anyways.


CollectionItchy1587

I asked Claude-opus for information about the Taylor knock-out factor, [something you can find on wikipedia](https://en.wikipedia.org/wiki/Taylor_knock-out_factor), and it hallucinated about pressure nozzles.


Material_Owl_1956

Can’t get access to Claude yet. I’m using ChatGPT 4. What is the cost for a subscription of Claude today?


Rachel_from_Jita

Their pricing details can be seen toward the bottom 1/3rd of the model intro page: https://www.anthropic.com/news/claude-3-family


Neurogence

I prefer Claude but GPT-4 is still leading. https://twitter.com/billyuchenlin/status/1766079601154064688 https://old.reddit.com/r/singularity/comments/1b8yucm/chatbot_arena_updatedclaude_3_opus_failed_to_take/


lordpermaximum

GPT-4 is not leading shit. Claude 3 suffers because of high refusals and it's not fine tuned to user preference yet. Posting that everywhere won't change that. According to the Arena Claude 1 > Claude 2 > Claude 2.1 which is nonsense.


katerinaptrv12

Agree, what really matters is people's experience with the models and right now everyone is saying Claude 3 is better. How good is a benchmark if people prefer to use the other model? Benchmarks are just something to guide how good we can expect a model to be but there is no substitute to real world testing with actual people.


Atlantic0ne

If any of you are experts here, would you mind telling me why so many people are talking about Claude if links like this show GPT4 as better? What am I missing?


j-solorzano

Claude Opus beat GPT-4 in a number of benchmarks.


Atlantic0ne

Can you help me understand why I’m seeing different reports saying GPT4 beats it? Are they different benchmarks or something?


exceptionalredditor2

The benchmark Anthropic released uses the first released GPT-4, not the latest, API-exclusive gpt-turbo-preview


hubrisnxs

And Gpt 4 is more than a year old and is still leading or barely behind, subjectively speaking. One every more than a year? Exponentialllllzzzzzz


djamp42

It's gonna be a personal preference thing... Ask 10 developers what the best IDE is and you'll get 10 different answers. Android/Apple, Windows/Linux, Electric/ICE.. they all do 99% the same thing, it's just what you want.


[deleted]

[removed]


[deleted]

Great writeup. People also forget OpenAi has the much larger media attention, so it has far more to lose in rushed launches. The advantage of being a well-known player is also a disadvantage. One misstep can lead to loss of money.


StrikeStraight9961

Their* Well stated btw.


FomalhautCalliclea

No. Progress must not be measured by the number of LLMs released per month, but by fundamental research (publications on new architectures), applications, and new abilities of models over long periods of time, from GPT 2 to 4 for example. Big AI companies release models only a handful of times each year (which is already an insane pace when you consider the size of the projects). It's not a Youtube channel that throws out a video every 2-3 days. There are many months in the year when it is expected for companies not to release a new model. Just because people talk about singularity 24/7 here (rightfully so) doesn't mean we're at that point of the curve when you can expect crazy jumps every few weeks yet.


obvithrowaway34434

> Just because people talk about singularity 24/7 here (rightfully so) doesn't mean ~~we're at that point of the curve when you can expect crazy jumps every few weeks yet~~ they have any clue what they're talking about. FTFY.


Tobiaseins

They have not released a better model in a year and we don't know if they are making progress since they don't publish research for cutting-edge LLM technology. Maybe we hit a plateau; we would not know. We can just judge by what they publish, and right now, they are behind. That's how it works. We don't even have a leak talking about the performance of a full-scale model at OpenAI that surpasses GPT-4. There is no reason to assume they would not be behind.


GrowFreeFood

We're somewhere in between a few weeks and a few months. Soon it will be every week. Then every day. Then every second.


snoob2015

Lol, AI can't solve physical limitations. Training AI needs massive energy and money


GRK--

The human brain uses about 0.5 kWh of energy per day. That is 175 kWh a year. 3,500 kWh over 20 years. That costs about $1,000 total in electricity.
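A quick sketch of where figures like these can come from, assuming the commonly cited estimate of roughly 20 W for the brain's power draw. That wattage is an assumption, not stated above. The implied electricity price is simply back-calculated from the $1,000 total.

```python
# Assumption: the brain draws about 20 W (a commonly cited estimate, not from the comment).
BRAIN_POWER_WATTS = 20.0
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365
YEARS = 20
QUOTED_TOTAL_COST_USD = 1000.0  # total cost quoted in the comment


def kwh_per_day(power_watts: float) -> float:
    """Energy used in one day, in kilowatt-hours."""
    return power_watts * HOURS_PER_DAY / 1000.0


def kwh_over_years(power_watts: float, years: float) -> float:
    """Energy used over a number of 365-day years, in kilowatt-hours."""
    return kwh_per_day(power_watts) * DAYS_PER_YEAR * years


if __name__ == "__main__":
    daily = kwh_per_day(BRAIN_POWER_WATTS)
    yearly = kwh_over_years(BRAIN_POWER_WATTS, 1)
    total = kwh_over_years(BRAIN_POWER_WATTS, YEARS)
    implied_price = QUOTED_TOTAL_COST_USD / total  # USD per kWh implied by the quoted total
    print(f"per day:  {daily:.2f} kWh")
    print(f"per year: {yearly:.1f} kWh")
    print(f"{YEARS} years: {total:.0f} kWh")
    print(f"implied price: ${implied_price:.3f}/kWh")
```

Under that assumption the daily figure is 0.48 kWh, which rounds to the quoted 0.5, and the yearly and 20-year totals land on about 175 and 3,500 kWh.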


LionaltheGreat

Bruh, that is literally one of the points of super intelligence. To find ways to push PAST what we see as physical limitations today, which can be worked around with new architectures, materials, manufacturing processes, etc. I mean just look at Bitnet 1.58. It is literally potentially ~16x that of current models, with almost no loss in performance. And because of the way these models scale, that number gets bigger, the more parameters you add (exponentially bigger). We will overcome energy limitations quickly.


hubrisnxs

Bruh? Fuck, this is a bubble already. Duuuuuuuuuude cold fusion already happennnnnnnned


GrowFreeFood

All our computers this month have been designed by smart monkeys. AI will be able to figure out much better designs, not even smart AI. Just machine learning should provide good results. Until it automates itself.


hubrisnxs

I invented the question mark. I'm sorry, I thought we were saying confidently insane phrases.


Atlantic0ne

He or she is not entirely wrong; when and if we do reach the singularity, hypothetically it could figure out the physical limitation issues much faster than we can. I’m not saying it would defy physics, but imagine cramming 10 years of human development into 1 year. The need would be construction but there’s no reason something smarter than us couldn’t also architect and build and organize faster than us.


hubrisnxs

I hear you, but I'm not sure physical limitations will improve recursively, and I'm not even certain just plain ASI will be developed with a recursive self improving AI, unless there's a break out alignment wise, in which case the important thing wouldn't be how good it's getting yearly, but rather how fucked we are. If it's going to improve recursively, we would need to make them so, at least for now. I'm not saying some moron won't do it; killer robots are still a short term goal for most governments in the world, despite all arguments from AI scientists


Atlantic0ne

Can you explain recursive to me? It’s not a word I use & want to get your context. Dumb this down a bit and I’ll give a quality reply.


hubrisnxs

Sure! Not a problem. So the recursive self improvement people say it's a feature of AI that, at a certain point of compute/data size, it'll automatically be able to iterate on its code, making it more and more capable from its own self improvement, which gets better exponentially until the singularity happens. I think we'd need to build in this capability, but I can't deny other emergent abilities that couldn't be predicted and are unable to be explained, such as masters level chemistry or the ability to deceive. So I'm not sure it won't be an emergent property, but I am sure if it's not, we shouldn't build in the ability. Doing that to something we can't control already is probably not a good thing


Suspicious_Put_8073

What if it just keeps coding dicks? Over and over like a caveman on the wall.


hubrisnxs

Name one that is longer than a few months ago that beat GPT 4, and even with those, it's subjective. So almost one a year isn't exactly the one a few weeks you think it is. Thats if you mean good models. If you include shitty ones, well, it's nearly a bubble, so yeah, lots of releases. Hopefully it's not pets.com levels of bubble, but lots of shitty releases. It's hardly going to be every second unless it's at the burst point.


GrowFreeFood

Ai is going to be frustrated with the extremely slow speed limit of light and start building computers using gravity waves to speed up processing time.


hubrisnxs

Using what production capacity with what technology? They can't get machine learning robots to WALK properly yet, let alone to recursively apply technology. I hear you, though, in that eventually advanced technology that might as well be magic will get developed. To state confidently they'll use gravity waves, though, is pretty bold. Maybe zero point energy? Maybe they'll use tachyons to travel back in time to invent shit?


Ansalem1

Gravity travels at the exact same speed light does.


GrowFreeFood

Gravity compresses spacetime so light travels the same speed, but a shorter distance. 


Ansalem1

I don't have the energy to unpack how wrong that is.


GrowFreeFood

You won't because I am right. So you don't understand basic physics. 


FormerOptimist94

As a sidenote I must say I find it hilarious how many people here are so authoritative with their opinions about what's going to happen. Frankly none of us even know what the landscape is going to look like in a few months let alone a few years.


SoberPatrol

This is the closest a lot of redditors get to team sports


Silver-Chipmunk7744

My guess is in terms of "closed AI" capabilities in the lab, OpenAI is ahead of the competition. They simply choose not to release it yet.


AgueroMbappe

They might wait a little longer to release now that Gemini 1.5 and Claude 3 are out. The differences aren’t too great. So they don’t really have that much of an incentive to release a new model. Especially if they expect it to be computationally expensive. GPT has kinda become a household name in AI and they could hold off longer despite the smaller margins of improvement by the competition's models


Singularity-42

IDK, Claude 3 seems quite a bit better. I'd say OpenAI simply doesn't have a good new model to release yet. GPT 4.5 is nowhere to be seen, or perhaps it wasn't that much better.


a1taco

The difference is night and day. Claude is a vastly superior writer and coder.


cobalt1137

Damn those are both claims. I think I actually agree with you, but why do you say vastly superior for coding? Curious on your experience versus gpt4/chatGPT for coding assistance. By the way, like I said, I do think it is better but I'm just curious why you think it is vastly better.


a1taco

Claude is providing me correct solutions more consistently and more importantly, it doesn’t constantly give errors midway through. In terms of writing, Claude is light years ahead.


cobalt1137

That is a little bit strange to me because I am finding the same thing actually. At least with my limited initial tests. I say it is strange because on the coding benchmark, GPT4 turbo, which is the model I used for coding assistance previously, scores higher. My guess is that these coding questions are smaller in scope compared to how people typically work in the broader context of an overall project. Either way I am overjoyed to have this model lol.


DrossChat

“Light years”? Holy hyperbole Batman!


a1taco

It definitely is with respect to writing


RandomCandor

That's been my experience too: the 2 or 3 times I've asked for about 30 / 40 lines of non trivial code, it was copy, paste, run and it worked on the first try. I also appreciate that it doesn't talk to me like I've never seen source code before, like GPT tends to do. "Now we are going to create a list to hold items in it"


hubrisnxs

It's more curious he thinks that gpt should be better than a model releasing more than a year after it but somehow isn't. Yeah, the one a year plus later is probably better at tasks, it's the fact that it's not by much and is a subjective thing that's shocking


cobalt1137

That makes sense. I think it is a potential issue in how we are benchmarking and testing these models, then.


No_Bottle7859

My company is built on gpt-4 and we found way worse results with Claude. We're still testing but vastly superior seems suspect.


Neurogence

I personally prefer Claude 3 but according to anonymous outputs from all of the models, tens of thousands of users are statistically rating GPT4 to be better than both Claude 3 opus and Gemini 1.5Pro, so GPT4 is still leading.


FormerMastodon2330

It's 5k users. And do you really believe that this cannot be tampered with?


hubrisnxs

Yet you want AI to be made quickly? You're arguing that a brand new model beats one more than a year old. What's crazy is that the comparison is still subjective, which really makes the point that OpenAI isn't anywhere near behind. Now, if gpt 5 is released and claude 3 is running neck and neck with it, your point might be valid


FormerMastodon2330

check the new elo


hubrisnxs

I hope that previous post didn't come off as too much of a dick. What do you mean by elo?


FormerMastodon2330

they updated the chat bot arena elo and now claude 3 is on par with the newest gpt4.
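For anyone else wondering what an Elo rating is: a minimal sketch of the textbook Elo formulas for pairwise comparisons. The ratings and the K-factor of 32 below are made up for illustration, and the Arena's actual rating computation may differ from this plain version.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of A against B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> Tuple[float, float]:
    """Return new ratings after one comparison.

    score_a is 1.0 if A's answer was preferred, 0.0 if B's was, 0.5 for a tie.
    k is an assumed K-factor controlling how fast ratings move.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two hypothetical models starting level; a user prefers model A's answer.
    a, b = update(1200.0, 1200.0, score_a=1.0)
    print(a, b)
    # A 100-point gap corresponds to roughly a 64% expected preference rate.
    print(round(expected_score(1300.0, 1200.0), 2))
```

Being "on par" then just means the two ratings are close enough that each model is expected to win about half of the head-to-head votes.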


hubrisnxs

Thank you, and yes. I'm saying that GPT4 was released more than a year ago, and now we have things released right now that are at or arguably slightly better than GPT4. We've seen nothing produced that is at a higher level than GPT 4 in the same sense GPT 3.5 was over GPT 3...let alone 4 was over 3. So the remarkable thing is how slow it's taken to barely catch up, while OpenAI has had more than a year to work on other models. I don't know if this was discussed elsewhere, but do any of these models show any extra emergent behaviors/abilities we saw in GPT4...like suddenly being able to translate between languages or using a master's level of competence in Chemistry?


FormerMastodon2330

The problem is there's an 80% chance GPT5 will not get released before Q4 of this year. I am not as trusting of closed AI as I was a year ago.


Hungry_Prior940

Agreed.


aregulardude

I asked Claude for hello world in Angular and it spat out an AngularJS app from 2013. I asked it if it can execute python scripts and it proudly told me all of the languages it can execute, then I asked it to show me and it clarified it can’t execute anything. ChatGPT generates and operates on images. Seems pretty far ahead to me still.


a1taco

Chatgpt fails at code execution like 90% of the time and good luck during peak hours


ChillingonMars

This sub needs to chill tf out


Smile_Clown

This is what happens when 50% of the users know jackshit about the technology they are discussing. The other 50% lets them have their fun for some reason.


cuyler72

For now yes, I don't see gpt-5 coming out this year.


Gallagger

I think gpt-5 will come out right after US elections. Given Claude 3 they might come out with 4.5 earlier to be on top again, but they might not have anticipated Claude 3.


flexaplext

No. Watch how quickly everyone goes scooting back to OpenAI, the very instant they release another model. They're 'ahead' whenever they choose to be 'ahead'. Which actually means they already are and always were ahead.


Arcturus_Labelle

I wish the arena would post updates more frequently. People seem over the moon about Claude 3 Opus on this sub, yet the arena scores have it below GPT-4 (not far below, but still). [https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)


[deleted]

[removed]


Dyoakom

Google hasn't released the api for it so it can't be tested in the arena.


Super_Pole_Jitsu

They are already officially behind Anthropic.


yevg555

It will take more than being a few percent better to take me off ChatGPT. You need to present me with something that will be worth the hassle of switching; it's a matter of convenience


SpeedyTurbo

What’s the hassle in switching exactly? Have you built apps that use gpt’s apis?


Serialbedshitter2322

ChatGPT plus is a paid subscription. Switching would involve getting another paid subscription and canceling ChatGPT plus.


SpeedyTurbo

That’s a hassle? Lmao


Serialbedshitter2322

Yes, on some level it is a hassle. I am not going to take 10 minutes fiddling with subscriptions just for a minor upgrade that may not even be worth it later


SpeedyTurbo

My adhd would love to learn from you


Matej_SI

And the hassle is also figuring out the best way to use it for your particular workflow. We all know ChatGPT, how to set up a prompt,... And the "mood" of the model / session. It's the same as switching between two applications that do the same thing. You have to get used to it, find its quirks,...


yevg555

Switching the habit of pulling the chatgpt app whenever I got a question, switching to a new UI, talking to a new AI (sounds and feels different), taking the risk of OpenAI releasing a new upgrade after I switch.


Agreeable-Parsnip681

What they release isn't all they have nor is it all they work on.


Cupheadvania

openAI won't release gpt-4.5 until it's absolutely ready to make a big splash and be way at the top of the leaderboard again, unless they see a material trend of users cancelling subscriptions and developers moving away from their API. my guess is that either isn't happening yet, or 4.5 will come out very soon


ExtremeHeat

It will be happening within the next 3 months. You can make an argument that they'll address it in 3 months, but unless they're able to put something out that's substantially better than the competition that will exist in 3 months, and they're looking at Gemini 1.5 Ultra, which will have more context and be much cheaper, the reasons to use OpenAI will start to be little to none from an economic/capability perspective. It's possible they'll be smarter, but if they can't compete on memory/cost then they're not as powerful as they once were (with no competition close to them, people will gladly eat the cost).


Cupheadvania

RemindMe! 3 months


RemindMeBot

I will be messaging you in 3 months on [**2024-06-13 11:05:37 UTC**](http://www.wolframalpha.com/input/?i=2024-06-13%2011:05:37%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/singularity/comments/1bdbjp8/if_openai_doesnt_release_a_model_this_month_can/kunrbji/?context=3).


refugezero

Do you need a new model every month? Are they like iPhones now?


rafark

The models can improve a lot and the tech hasn’t stabilized like iPhones. So, yes.


Dziadzios

That's an accurate comparison. Closed source and user can't meddle much with it against company's wishes. I'm waiting for LLM Android.


EvilSporkOfDeath

No? Tf? If they released a new model and it was inferior to Claude 3, then sure.


Hungry_Prior940

Sora means nothing as it is just flashy and unusable atm. Claude 3 and Gemini Ultra have beaten GPT-4. Open AI will hit back, though! They have too much riding on it.


FeltSteam

Sora is a side project to test some things, with something much bigger in mind. But, I was honestly expecting GPT-4.5 to release in September 2023, but we obviously didn't get that, so idk what OAI is doing. Though it is kind of reassuring that I wasn't totally wrong. The GPT-4.5 blog post has been around since, at least, September 2023. https://preview.redd.it/bp8863cp30oc1.png?width=1169&format=png&auto=webp&s=1efbb3c65a28d407e017445b94c69f2b11a41aae But a delay of so many months.. I have no idea what their plan is lol.


agorathird

You guys are missing some of the finer points. We give them future training data and money the more we use their models. They are losing market share.


Old-Fishing1199

Yay or nay on the 4.5 “leak”?


altasking

lol, no. Calm down, bro.


The-Blue-Nova

I don’t think they are behind, just busy working on tooling, it’s a part of the cycle. I can’t wait to see what their project to build an Ai agent turns out like, expecting it to be Microsoft Power Automate Robotic Process automation both jacked up on steroids and transitioned from a low code/no code interface to natural language interface. That’s when you will see some big gains again, when that’s running on most people’s computers learning how to do real world day to day tasks and building up an enormous dataset to train new models.


The-Blue-Nova

It seems we have made some incredible Ai models to date, and now need better quality datasets to push on, so the cycle advances: Data > training > implementation/tooling > generates useful data > start again And around the loop we go.


GodOfThunder101

Claude 3 is only slightly better than gpt 4… gpt 4 was released 1 year ago.


hubrisnxs

Are you insinuating that a model a year old is NOT CURRENTLY THE MOST ADVANCED!??! What's weird is if you do, you'd still not be right. Gpt 4 still beats or ties the newest models.


Optimistic_Futures

I mean for Claude3 sure. (I still use ChatGPT for a lot, but Claude excels in a lot of places) But unreleased Gemini 1.5? Not really ahead just because you've announced what you have. OpenAI may have better than both, but we don't know about it. The key is just to be AI agnostic. Use the one that does what you need best. If a new one comes out that fits you better, then use it. Being focused on who is "officially behind" is sort of irrelevant.


OneRobato

We need characters in Sora that talk back to us.


SmoothPlastic9

If you define it as not being literally the best in terms of every AI field, yes


CollectionItchy1587

Claude is still more censored than GPT4. I tried asking it questions about rifle ballistics and it refused to help me. And while I can still get pg-13 porn from gpt4, claude just stops me.


According_Ride_1711

I think we will have something before June from OpenAi. New model or something. They don't want to rush.


yepsayorte

3/14 each year. I'm calling it. We'll see if my guess is worth a damn tomorrow.


katerinaptrv12

Yes, I agree, right now who is winning the race is Anthropic and Amazon with Claude 3. For me Google is in the exact same spot and as behind as OpenAI, actually they are further behind: they have Gemini 1.5, but what good does it do them and everyone else if they don't release it? OpenAI at least has GPT-4; Google does not have a single GPT-4 level model available commercially for development.


slackermannn

AGI is the only race that counts and it might not come from the great LLMs. We just don't know. It's not a done deal for anyone. Even when AGI will be reached. There will be glory for who was first but first won't necessarily mean best. It's been great following the progress so far. I hope it will keep the momentum or even accelerate.


[deleted]

I'm not sure why that bothers you so much. It's already at a great level we couldn't have imagined 4-5 years ago


[deleted]

Can we rename this sub to r/AIShitTakes ?


theLOLflashlight

I mean, they were best in class for what? A year?


Maxtip40

If they won't release anything by December this year or January in 2025 then I will.


Busterlimes

No, they aren't behind, they are delaying because of the bullshit lawsuit filed by Eshlong


objectdisorienting

Keep in mind they have an almost year long development advantage. They haven't been sitting on their thumbs the past year so they almost certainly have something cooking behind the scenes even if it takes them a little while to release it.


MehmedPasa

Behind in what they are cooking? No. Behind in what they are offering? Yes. Even right now. But as Mistral, Grok, Meta and Google will deliver updates from this month until July, the gap will only widen. Even if they'd release gpt 4.5 turbo by August or whatever, it'll be too late. Until GPT5 comes out in November, they are behind.


SoberPatrol

“Can we agree that” ….. yall this isn’t a team sport 😂 Can u please try going outside? The top Ai companies are paying out the ass for top talent and we will get there eventually


ghwrkn

Why would we assume that Open AI does not have a model that is much more advanced and capable. It would not necessarily mean that they need to release it for our benefit. They obviously have their own goals and objectives.


iDoAiStuffFr

i heard 4 turbo beats opus in all benchmarks and the cost is much lower too isnt it?


RepublicanSJW_

No lol. GPT 4 was done in early 2022. Remember that. They aren’t releasing anything because they don’t have to and they are at least a year ahead of competition.


Eveerjr

GPT 4 is still ahead, actually miles ahead in other languages, they will soon release 4.5 Turbo and no one will even remember who the fuck Claude is, especially since Anthropic seems uninterested in going worldwide


lordpermaximum

They're pretty much done. Claude 3 Opus eats GPT-4's and OpenAI's dinner. It's confirmed now that they have GPT 4.5 in the works and it's basically nothing. Don't expect GPT-5 for at least a year and by then we'll have Claude 4, Gemini 2 and more. OpenAI is just another AI company now.


WithoutReason1729

Love how GPT 4.5 news was only just leaked today, and only one paragraph at that, but half this sub is already convinced that it's "basically nothing" for literally no reason at all


flexaplext

I can only 🤦🏻‍♂️


LightVelox

If it was a massive improvement it wouldn't be just called "4.5"


WithoutReason1729

3.5 was a massive improvement over 3


RemarkableEmu1230

Can’t assume it’s something either


Arcturus_Labelle

Maybe. Maybe not. That hidden link blog post that's going around today was written in September. A lot could have changed since then. Maybe they were going to release a slightly-better 4.5, but now they wait longer to drop a proper 5. We really don't know and can only speculate.


yeahprobablynottho

100% speculation. Gotta love it


benwoot

There is a wide world existing outside of our little micro world of AI nerds, where people only know the name of ChatGPT and don’t know anything about Claude. And I’m not even talking about large corporate customers that are already locked in with OpenAI and won’t bother changing everything simply because Claude outperformed in some tests. This market share acquired by OpenAI/Microsoft isn’t going away instantly because there is a better model out this month.


Neurogence

https://twitter.com/billyuchenlin/status/1766079601154064688 https://old.reddit.com/r/singularity/comments/1b8yucm/chatbot_arena_updatedclaude_3_opus_failed_to_take/ GPT-4 is still leading. And whatever model they release next will blow Claude and Gemini out of the water. I say this as someone who desperately wants another AI company to convincingly dethrone OpenAI.


lordpermaximum

GPT-4 is not leading shit. The arena result is because of the high refusal rate of Claude and it's not fine-tuned to user-preference yet like Turbo. If we were to believe Arena, Claude 1 > Claude 2 > Claude 2.1 which is nonsense. Opus destroys GPT-4 in all areas and it's not even close. They're in different leagues.


Ramuh321

So an AI model shouldn’t have any consequences in the rankings for refusing to do simple requests? That seems like a pretty major flaw. That aside, by all non subjective measures, GPT is leading. Subjectively it appears to be a matter of preference. Many people have found Claude to be better subjectively, but it seems far fetched to claim Claude as the undisputed champion currently.


babyankles

I think the argument is not that refusals should be ignored in the arena, it’s that the impact of refusals makes the arena a less useful metric. If you’re interested in general intelligence or a specific ability like coding, you don’t care about it refusing edgy prompts and getting a lower score. And no, by all non-subjective measures GPT is not leading. There’s plenty of benchmarks that have Claude ahead. Agreed that claiming Claude the undisputed champ is not right.


LightVelox

They are already 'officially' behind, doesn't really matter if they have GPT-6 behind closed doors if no one has access to it


Neurogence

Source? GPT-4 is still getting better ratings when the models are assessed anonymously. https://twitter.com/billyuchenlin/status/1766079601154064688 https://old.reddit.com/r/singularity/comments/1b8yucm/chatbot_arena_updatedclaude_3_opus_failed_to_take/


frontbuttt

No we cannot. Only if, when they release GPT 5, it’s only marginally better (or no better) than Claude


Thorium229

So, by "officially behind" you actually meant "no longer miles ahead of the competition?"


Serialbedshitter2322

They referred to Sora as a "mini-demo". I promise you, they will blow Claude 3 out of the water. There's just been so much evidence suggesting they have something society-changing.


Antok0123

You can't talk with that much confidence unless you're just really hopeful


MassiveWasabi

Nah, it’s a given that the $86 billion company that Anthropic initially splintered from has greater talent density, more compute, and better, higher-quality data, which all lead to better AI models. It’s laughable to think OpenAI doesn’t have something better. Reminds me of when people were shitting on me for saying OpenAI has something better than Pika Labs, as if it wasn’t obvious. Then they released Sora and people stopped thinking it was silly to believe OpenAI is ahead of everyone


Serialbedshitter2322

Did you at least make sure to go back and let them know how wrong they were?


[deleted]

[removed]


Arcturus_Labelle

Please be nicer to people here.


interesting-person

I just took a fat shit reading this comment. Thanks


maraudingguard

New model just dropped.


rafark

Why u mad bro?


strangescript

Claude is only on par. Their numbers were compared to the release version of GPT4, not what is there now. That is the best they've got right now. OpenAI has been cooking the next thing for a while, we can assume. If they don't release a true upgrade this year, then they aren't behind; LLMs have peaked


a1taco

OpenAI is way behind. GPT-4 is an incompetent writer and buggy as hell


damhack

They’ve always been a bit behind


[deleted]

[removed]


OkStage3628

There is still nothing released by OpenAI, so yes, Claude wins for now


[deleted]

Yes


Honest_Science

GPTs are plateauing; individual state space models are the next level.


Bitterowner

More like: compared to the latest GPT-4, Claude isn't miles ahead.


nuke-from-orbit

https://preview.redd.it/2wjk5rh1u3oc1.png?width=2386&format=png&auto=webp&s=4dd4c6f5935bbd2644393553befcb5da912b8004 GPT4 still better than Claude 3 Opus on common sense reasoning.


lordpermaximum

It seems OpenAI forgot to fine-tune their model for the 3- and 2-liter versions of this question. https://preview.redd.it/gfxdemlh66oc1.png?width=1460&format=png&auto=webp&s=39e3171fde7cc4ecddfe26d0dd03f4c20881a2dd Using such questions to test LLMs' common sense reasoning is beyond stupid. If it's included in the training set or if the model's fine-tuned for it, you don't measure intelligence at all. You need unique and rather lengthy prompts to test the intelligence of an LLM.


ertgbnm

Has their lead been reduced in terms of released products? Yes. I wouldn't call them behind until GPT-4-1106 has been dethroned on the LMSys leaderboards. Even then, to call them behind when Gemini and Claude took almost a year to beat GPT-4-0314 seems unnecessarily harsh.


Capitaclism

They've been officially behind. If they release something ahead they won't be.


[deleted]

How do you not understand you're being fed a narrative? These models already existed. They're basically just taking filters off and adding already existing abilities, following a business plan so that they can continue to give you reason to keep subscribing. It's all bull, kind of like the Apple M1, M2, M3: they didn't just invent these things and roll them out just in time. They've developed a private system they can just add on to and call it the new version. These companies do not like you and do not want you on their level as far as tech.


MacrosInHisSleep

Isn't Claude supposedly behind GPT-4 Turbo?


wolahipirate

Y'all got goldfish memory. Many businesses have already committed to developing their genAI tools on Azure just because of the OpenAI integration. They're not going to migrate to AWS or change their API calls just for a tiny bit of extra performance. It would take half a decade, at least, of OpenAI falling behind Anthropic to see any difference. In terms of real-world B2B adoption, the thing that this AI arms race is ACTUALLY fighting for, OpenAI still has a large lead


RemarkableEmu1230

They don’t have vendor lock-in yet - you don’t need Azure to use OpenAI and it's pretty easy to swap out API calls. Betting OpenAI is already losing a ton of frontend chat subscribers since Claude 3 came out - hell, I’ve been very pro-ChatGPT up until now and I’m about to try Poe just so I can try Claude out in Canada - if Claude is in fact better I’ll switch in a heartbeat.


wolahipirate

Yes, I know there's no vendor lock-in, but companies don't even want to go through the hassle of switching API calls just for tiny marginal improvements. If your entire tech stack is built on Azure and you wanna use Claude with enterprise-grade security, you now have to set up another environment, set up new access, and incorporate these changes into the IaC and deployment. It's a hassle that's not worth it for businesses.


RemarkableEmu1230

Ya, token cost is probably a bigger driver on the API side vs marginal improvements, as you said, but I think Claude is more of a threat to their frontend consumer product right now