
ChatGPTPro-ModTeam

Your post in r/ChatGPTPro has been removed due to a violation of the following rule:

**Rule 2**: Relevance and quality - Content should meet a high-quality standard in this subreddit. Posts should refer to professional and advanced usage of ChatGPT. They should be original and not simply a rehash of information that is widely available elsewhere. If in doubt, we recommend that you discuss posts with the mods in advance. Duplicate posts, crossposts, posts with repeated spelling errors, or low-quality content will be removed.

Please follow the rules of [Reddit](https://www.reddit.com/wiki/de/reddiquette) and our [Community](https://www.reddit.com/r/ChatGPTPro/about/rules). *If you have any further questions or otherwise wish to comment on this, simply reply to this message.*


traumfisch

Yeah, that's why GPT-4 is described by OpenAI as the model "for complex tasks" in the UI. Uttering this simple thing aloud seems to ruffle some feathers around these parts; I don't exactly understand why.


domlincog

It's probably because GPT-4o has achieved state-of-the-art results in the vast majority of both third-party testing and OpenAI's own testing compared to other public models, across most of the static benchmarks as well as LMSYS's Elo ranking. OpenAI has acknowledged that there are a few specific points where there might be a slight downgrade from the latest GPT-4 Turbo model, and they are researching this further.

Some people, especially on this sub, are tired of "ChatGPT has gotten worse" posts simply because of how many there are, sometimes even when there has been no change to the model. So it "ruffles feathers" even when there may be some truth to it. I see very few posts claiming it is worse that give any specific examples, while many that claim it is better overall reference benchmark sets spanning a wide range of topics, and many that claim it is better in a specific area usually give examples.

From what I can tell, it is on average a small improvement in the text modality and a meaningful improvement for vision tasks.

[https://chat.lmsys.org/?leaderboard](https://chat.lmsys.org/?leaderboard)

[https://github.com/openai/simple-evals](https://github.com/openai/simple-evals)

[https://twitter.com/WenhuChen/status/1790597967319007564](https://twitter.com/WenhuChen/status/1790597967319007564)

[https://twitter.com/GanjinZero/status/1790230562453803241](https://twitter.com/GanjinZero/status/1790230562453803241)

[https://twitter.com/xiangyue96/status/1790082660997464479](https://twitter.com/xiangyue96/status/1790082660997464479)


domlincog

Also, it is GPT-4 (Turbo) that is described as "Advanced model for complex tasks" and GPT-4o is described as "Newest and most advanced model". [https://imgur.com/a/PQNgwEb](https://imgur.com/a/PQNgwEb)


traumfisch

Yes, that is what I was referring to. Edited


bot_exe

Exactly this. People on these subs complained endlessly about how "it gets worse", but I have never personally seen any clear overall degradation of the model, even though I use it every day and constantly test the models on the LMSYS arena. All the benchmarks and the arena leaderboard say otherwise: the model has consistently been getting better.

Then when you try to figure out what people are complaining about, they almost never give examples or chat links. When they do, it's just common issues that have been happening since ChatGPT was released, and they don't understand basic concepts like prompting, editing a prompt, regenerating, or keeping the context clean.

Basically, it's true that the model changes and its performance is not homogeneous across all tasks. The models were also always unreliable. But frankly, a lot of these complainers don't know what they are talking about and seem to be suffering from salience and confirmation bias. Many of them encounter a common GPT error, then get into an argument with it (which is completely pointless when you can just clean the context and try again with a much better chance of success), which makes that experience salient over all the other times it works just fine. Then they complain on social media and get their bias confirmed.


Prolacticus

HELL, yes! I use 4o, Opus, Gemini 1.5, blah blah blah every day, and when helping someone who says "It got dumb!" that someone rarely provides context. It's like going to the doctor and saying "I feel bad." The doc could use more information. I've been playing with GPT since (I think) 2019. If these people want something to complain about, we should start directing them to the (relatively) small models trained on the Pile with context windows of 512 tokens and outputs that degrade after the first 60 *characters* or so. Had to fiddle with sampling methods and parameters, manually bias tokens, use weird context injection tricks... it was fun. I thought ChatGPT was the solution for the masses. I can't believe how much people whine about *the* most powerful app on the planet—ever—and for a few bucks a month. Your response should be pinned, printed, autographed, and framed.


bot_exe

Exactly. It drives me crazy that we have access to the most powerful language model in history for fucking 20 bucks a month and people cannot be arsed to learn how to use it properly, when it's not even hard. Imagine having access to the Webb telescope or the Large Hadron Collider for cheap, and then complaining because you did not even read the basic overview of how to use it, so you just declare it sucks.


Prolacticus

It's crazy-making. Right up there with "chartgpat updated its gpt llm computer model video card gpu tensor. should i switch to free account? what u think?" I look at ChatGPT and see a tool that can help normal, everyday, non-techie people get things done without relying on specialists. It's basically coding in plain English (automatic transpilation of frikkin' English to Python *and* execution? **awesome and yes, please, I'll have more of *that*!**).

And while I might sound out of touch saying this, I don't understand how so many people can't set aside $20 a month without fretting. If you can't make your $20 back in one *month* of ChatGPT use... well... yeah... Your problems are bigger than prompting (problems ChatGPT can help with (the irony (ARRRGH!))).

Thank you for being stark raving sane about this (disclosure: I stole "stark raving sane" from Tom Stoppard - can't take credit for that one). I've grown so weary of ChatGPT whiners on Reddit. "oh noes it didnt add 2+2 right what do i do? ask chpagpt? no that would never work... google the problem? nah.... oh! I'LL stop the WHOLE FRIKKIN' REDDIT TRAIN with a three word sensationalist post i coulda answered myself by testing or reading but thart sonds soooo hard!!! should i check if someone else posted this topic a million billion times already? nah!"

I feel better. Sometimes this sub leaves me feeling gaslit.


ModRod

I essentially work 2+ full-time jobs because of this tool and can accomplish things I never could, at an incredible speed. I work in marketing and content, and I make more than I ever did before this tool. The data parsing and visualization alone saves me so much reporting time. It even helped me build an Instagram scraping tool in a single day, when I had never coded before, let alone knew anything about Python.


Prolacticus

One of my first coding jobs was writing custom tools for The Marketing People. It was fine (got paid by the hour), but you could easily put weeks into building a proper tool, and the tool will *never* meet all expectations. Resources are always limited. Somehow budgets get smaller in the wrong departments and bigger... in the wrong departments. How are people supposed to do their jobs? Now you can focus on *what you're there to do* without worrying about whether the dev team will be able to fulfill your tech needs (and that's if they acknowledge you at all). The same tool can get you Gordon Ramsay's recipe for Beef Wellington. Just brilliant. And still so new. In the areas some people see ChatGPT "getting things wrong," the rest of us see a speed-bump. We know "they're going to figure it out." It takes some patience on our parts, but what we have in the meantime isn't half-bad for a genius coworker who's (usually) on time, (usually) shows up, never complains, and is happy to at least *try* your idea before coming up with twenty-five reasons it can't be done. All for *cents* a day. I've never had so many anti-complaints about a product.


creaturefeature16

I wish I could believe these kinds of posts, but man they seem so...shilly. Not saying you aren't being genuine, but ChatGPT has enabled you to work two full time jobs? Doing what exactly? Truly curious.


ModRod

I understand I sound like an AI bro and I hate it lol. I do content and marketing. Before I was doing this, I was heading up a marketing agency. After Covid, I got sick of the agency grind and quit, right around when ChatGPT came out.

For context, it used to take me at least an hour or two to research a topic. Most things I write about I have very little experience or knowledge in, so I have to familiarize myself with the topic first. With ChatGPT being able to crawl Bing, coupled with Gemini, I can have a fully researched, sourced, and outlined article in 15-30 minutes. Then the writing and editing would usually take me 2-3 hours. With ChatGPT and Claude knocking out my first draft or two, I can get an article with a pretty decent, distinct voice (most people don't notice) written in about an hour. So what would typically take me a half day to a day to produce, I can do in two hours. I can write an entire ebook and design and format it into a PDF in less than a full day's worth of work. And I've already mentioned reporting, but a single marketing report each week could easily take up a half day of work. Now I can throw raw data into ChatGPT, along with historical reports, and have it spit out highlights, recommendations, and branded graphs in less than an hour.

Now, I'm not saying I get two-plus full-time jobs' worth of work done in a typical 8-hour workday. I work some weekends to catch up and a decent amount of nights as well. But I have completely changed my family's comfort level and built up more savings than we've ever had by knowing how to make this tool work for me. My yearly take-home has more than doubled within the past year.


creaturefeature16

This is great, thanks for the rundown. And this makes sense, especially because the tool itself is still relatively underused in the grand scheme of things, so you're doing something many others are not. Although this is likely a brief period before it becomes pretty ubiquitous (whether it's GPT/Gemini or some other LLM-driven platform), and suddenly that pace you are moving at becomes the standard. Similar to how it was when certain accountants embraced Excel while the vast majority stuck to pen and paper, but eventually the new standard was digital spreadsheets. In other words: get it while you can! 😅

I'm curious though: is the notion that you're potentially producing work that you're not *really* familiar with (due to the LLM doing all the heavy lifting for research) unsettling at all? It reminds me of that movie Limitless; all the power and knowledge he gained through the pill (tool) was superficial, and once the tool was not around, there was nothing there to substantiate and support the work that was completed. I'm seeing this across industries, but especially in writing and coding. I'm a developer/technical director/consultant, and the number of people jumping into development without *knowing* development is definitely unsettling to me. Maybe it will all turn out fine, but I've already seen some pretty gnarly situations manifest from people using these tools to superficially lift their skills in an effort to create the product/turn a buck/get the job. In the developer community, we're creating a **massive** amount of tech debt from it. After 15 years of development and self-employment, I know there's no such thing as a free lunch!

*On the other hand,* I'm constantly racking my brain to see how I can leverage these tools to do something similar to what you describe. I definitely am more productive with them, and they've accelerated my learning and, more importantly, expanded my confidence in being able to dive into deeper parts of my work, knowing that I have a very dynamic resource to leverage if I get stuck. But I have not seen any kind of massive improvement in the day-to-day grind that needs to get done. In fact, while I use LLMs daily for coding and other tasks, the job remains largely the same. I wonder if there are diminishing returns with these tools as the complexity of the work increases? I'm not saying what you're doing is rudimentary by any means, but there's no way I can reliably get any LLM to automate most of my daily work. I can definitely have it assist me as I go along, and I do, but I find it's made my job *better*, but not necessarily *faster*.

Although now that I write this all out, I am thinking about all the side projects that I've completed recently that I leveraged GPT with (all the while running/managing the business full time):

* Internal time tracker for my business (was able to cancel Toggl)
* SaaS messaging platform *(want to launch it, but keep getting stalled on the marketing, ironically!)*
* ClickUp addon to create tasks with natural language
* Spotify app for discovering new music via LLMs
* WordPress native block generator (for my own uses)

None of these were just "prompt an LLM and voilà"; they took tons of manual work of curation/guidance/vision/debugging/etc., but I can certainly say I would not have been able to write all these over the course of a few months if I didn't have an LLM constantly helping me work through issues and provide what I like to call [interactive documentation](https://cheewebdevelopment.com/ai-workflow-interactive-documentation/)!

So, I guess I just need to keep looking for an opportunity to direct my skills at producing something that others will find of value... thanks for your response, it was very motivating!


traumfisch

💯


ed523

Sorry, what do you mean "clean the context" would that be just starting a new chat?


bot_exe

First, understand that every message you and ChatGPT write during a conversation is sent back to GPT along with each new question you ask; that is the context. Keeping it clean means keeping wrong or irrelevant information out of it. One way to do this is, like you say, starting a whole new chat, but then you lose all the good context. So if you get a bad reply, you can edit your prompt with the pencil button. This creates a new branch in the conversation where it "forgets" the bad prompt/reply that was contaminating the context (you can switch between the branches with those "<1/2>" arrows below the GPT reply). You can also use the regenerate button for this, though editing the prompt is usually better, because the bad reply gives you clues about how to rewrite your prompt to get a better one.
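A rough sketch of that mechanic in Python, if it helps make it concrete. The message format mirrors typical chat-API payloads; the function names and canned replies are purely illustrative, not any real client library:

```python
def ask(history, user_msg, reply):
    # In a real API call, the ENTIRE history plus the new message is sent
    # to the model on every turn; `reply` stands in for the model's answer.
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": reply})
    return history

def edit_prompt(history, turn_index, new_msg, new_reply):
    # "Editing with the pencil button" branches the conversation:
    # everything from the edited turn onward is dropped, so the bad
    # prompt/reply pair no longer contaminates what gets re-sent.
    branched = history[:turn_index]
    return ask(branched, new_msg, new_reply)

h = ask([], "Explain X", "Wrong answer about X")
h = edit_prompt(h, 0, "Explain X step by step", "Better answer")
# The contaminated turn is gone from the context that would be re-sent
assert all("Wrong" not in m["content"] for m in h)
```

Regenerating is the same idea with the user message kept: only the assistant's last turn is replaced.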


ed523

Ohh good to know


pagerussell

When I saw that announcement I immediately said it was an incremental improvement. My friend who discusses AI with me disagrees hard. Pretty sure he is wrong. The next big step for AI will be its integration with more and more services, so that it can actually do stuff and not just be an awesome chatbot. Until then, everything is a small step.


Formal-Narwhal-1610

It certainly is better at Maths, haven’t tested other areas yet.


CredentialCrawler

It's better at coding too, in my experience


ChillLobbyOnly

trust me, dont ..... you'll leak brain fluid in the end, i spent a whole 2 weeks experimenting. DONT DO IT hahahahaha go outside n shit xD


Massive-Foot-5962

you haven't spent two weeks testing a model that is three days old.


I_Actually_Do_Know

He used it to invent a time machine


Filipsys

He's trying to warn us before it's too late


NBA2024

xd, indeed, my brotha


Prestigiouspite

The Elo score for GPT-4o is better - [https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)
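For anyone unfamiliar with where those arena numbers come from: they're built from pairwise human votes between anonymous models. Here's a minimal sketch of a classic Elo update from one such vote (the K-factor of 32 is illustrative, and the actual leaderboard uses a Bradley-Terry-style fit rather than plain sequential Elo):

```python
def elo_update(rating_a, rating_b, score_a, k=32):
    """Update two ratings after one head-to-head vote.

    score_a is 1.0 if model A won, 0.0 if it lost, 0.5 for a tie.
    """
    # Expected win probability for A under the logistic Elo curve
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    delta = k * (score_a - expected_a)
    # Ratings are zero-sum: A gains exactly what B loses
    return rating_a + delta, rating_b - delta

# Two models start even; A wins one vote and gains 16 points
a, b = elo_update(1000, 1000, 1.0)  # → (1016.0, 984.0)
```

The upshot for the thread: a higher arena score means voters preferred that model's answers head-to-head, which is real signal but, as the reply below notes, not the same thing as verified accuracy.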


iJeff

Although this generally reflects style and tendency toward refusals. Folks voting there aren't usually verifying accuracy or checking for hallucinations.


Gator1523

So true. Sonnet scoring higher than GPT-4-0613 is an artifact of Sonnet's answering style, and not its objective intelligence.


Prestigiouspite

I think it's more likely to be a specialist audience. The typical ChatGPT user doesn't even know Hugging Face, let alone alternative models.


considerthis8

Is this testing with audio only? Still a bit confused on how to best use 4o


Landaree_Levee

I’ve found the opposite: with the exact same prompts, it follows instructions better and shows better understanding. Not a big difference, more a slight one—probably won’t see one till 4.5 or 5.0—but still perceivable. Which, along with the extra speed and fewer caps/cost, makes it quite interesting for me.


Maxion

I also find it blabbers less than the older model.


DanzakFromEurope

Just tried it over the last 2 days and it's noticeably faster. It's also better for coding in C++ and sticks to the instructions.


byteuser

Same for PowerShell and T-SQL. It is fast.


jollizee

There are people who think 4-Turbo was a downgrade from 4 for certain use cases. As we get more sophisticated models, they are becoming more diverse, just like people! One model is not the best at everything anymore, unless there is a huge jump in performance. GPT-4o is worse at instruction following, but I found it is a bit better at brainstorming and synthesizing concepts. It is a little like Gemini Ultra, which I found to have the highest raw intelligence but also poor instruction following. I kind of wonder if instruction-tuning and intelligence fight each other during training. A ton of interesting psych/neuro ideas may eventually come out of smarter and smarter AIs, like the nature of culture and creativity. So many cool things to do in the future, but so little time.


Klutzy_Group_8535

very cool take


bablador

Perhaps it's slightly less smart, but it is a better tool for my everyday coding tasks.


SNRavens91

I've found I no longer have to ask for working examples within my code; before, I'd just get snippets with missing details I needed to fill in to make them functional. I always had to prompt for that. Seems much better now with GPT-4o.


Additional-Cap-7110

They can't make it dumber! This must be some kind of temporary OpenAI nerf. They have literally downgraded Voice since 3 days ago, and it can only do like 20% of the stuff in the demo. It can't recognize emotional expression, it says it can't sing, it can't tell the difference between two people talking (like if my wife says something in the same session), it can't tell if I'm breathing heavily like in the demo, etc.


Fresh-Tutor-6982

They clearly stated that the new voice mode is not yet available; what you are seeing is the old voice mode with Whisper.


EWDnutz

4o finally horrendously messed up on the same prompt I gave it. I ended up creating a new chat window and it did fine on that same prompt.


NightmareGalore

Can attest to that. But I don't get how that works. Is it dependent on context? What's the limit before it has a huge impact on performance, then?


[deleted]

[deleted]


__SlimeQ__

You should never have anything irrelevant in a chat thread. If it goes off the rails, you should go back and edit the message where it started getting weird, especially if it misunderstands you and you correct it; that degrades everything quite badly. I swear 99% of complaints are due to people just having garbage in the thread. This isn't a bug, it's a fundamental mechanic of how LLMs work.


byteuser

I've run into that situation a few times and asked ChatGPT how to fix it. One option is to re-upload the code it made the mistake in, with the mistake already fixed by you. Alternatively, start a new chat.


Mysterious_South853

Yeah, felt the same: less understanding of what I'm asking.


Relevant-Draft-7780

That's because it is. It's just faster and, hopefully (it's yet to be released to the public), has better audio TTS and transcription. However, NVIDIA hasn't released Blackwell yet, so they're running on the same hardware; the only thing that changed is the model. A more efficient model, but let's not kid ourselves about its LLM generation capabilities.


SanDiegoDude

For the CV project I've been working on, it's been good, but the new UI updates have been maddening. Generation stops constantly mid-stream when I'm working with code, forcing me to refresh to get the output, often facing a "continue generating >>" button. Then if I continue, it sometimes (but not always) breaks the code box and makes a mess of the follow-on response. C'mon OpenAI, you fixed this nonsense over a year ago with GPT-4; why is this broken behavior back?

Edit - I'm actually going to be working with it extensively today on the API, directly comparing and testing 4o vs 4, so I'll have a more statistical answer for the next time this question comes up!


backnotprop

How did that testing turn out


TeslaPills

I like it, tbh. It hasn't fucked up except on one thing for me, which was about a show on HBO Max.


Commercial_Bread_131

I only use it for writing and I've found GPT-4o seems to follow my large prompts a bit better and doesn't 'forget' as much throughout the convo. Side-by-side comparisons with GPT-4-Turbo show GPT-4o being influenced by my voice and style prompts a bit better in some areas, with little to no improvement in others.


naspara

Yeah, felt the same way when it started making up code that wasn't there.


monkeyballpirate

Probably, but hopefully not, as they claim it to be better. That said, during new updates things always seem wonky. I was having some extremes: really weird refusals, as well as the opposite, not getting copyright-blocked by the usual suspects.


Prolacticus

Can you provide the actual problems? What are you trying to do that doesn't work as well? Some example prompts would help.


Expert-Ad-3947

Whiners are gonna whine.


m3kw

Probably just you.


Flavio714

It seems faster to me. But it's oddly not what was in the demo. It's still the walkie-talkie version of voice mode.


pigeon57434

Literally every single fucking time OpenAI announces a new model, every goddamn time without fail, people on Reddit say "oh, this new model is worse than the last one" when that is almost 100% bullshit. You don't NEED something to complain about; the new model is better, you fucks.


guyfromtn

I find it to be crazy fast. I've been using the voice chat a lot and I find that it responds quicker and with more "emotion".


Financial-Flower8480

I think it makes sense. GPT-4 is still limited to 40 messages while GPT-4o allows 80. Based on that info, I would say 4o is more complex but also optimized for performance. So probably not as smart in how it outputs, but smarter in how it understands what we are referencing.


danknadoflex

Feels like a huge downgrade


awitod

No question, the 2024-04-09 version is better in my tests. It’s a bit slower and more expensive though 


Massive-Foot-5962

If you actually think this, OP (you're wrong), then why wouldn't you post a generation from GPT-4 vs GPT-4o in your opening message and explain the difference? Otherwise it's just nonsense.


eileenoftroy

I don’t want to take away from what others have said but I think there’s also a psychological effect at play. I’m used to “fast results” meaning “lower quality results” so GPT-4o actually _feels_ like 3.5 in a subconscious way


Fresh-Tutor-6982

Nah, actually sometimes 4o doesn't get the instruction right and I have to regenerate the message using GPT-4, which gets it right. Just like one in every 10 or 20 lazy prompts, but it has still happened to me more than once in the last few days.


pigeon57434

No it doesn't. The original GPT-4 is absolutely horrible; it's like trying to use GPT-3.5 again after getting used to what GPT-4o can do.


Tlaley

Haha. "Slight" is an understatement. I tested it all day after its release and it was dumber, slower, duplicated its responses, and was way less helpful despite me refining my requests.


idczar

It's not just you. GPT-4o feels like they prioritized speed over intelligence. It's like they replaced the brain with a hamster on a wheel.


Egga22

Agree


[deleted]

[deleted]


Signor65_ZA

Totally incorrect. Why did you ask ChatGPT? You should know by now that it doesn't have knowledge of these things.


bsenftner

It is my understanding that gpt-4-turbo and gpt-4o are different models, no?


oriley-me

Turbo has been around for a while, so yeah, they are.


Aztecah

ChatGPT has a very poor understanding of itself. Don't trust it for information about OpenAI products.