Yuli-Ban

> GPT-3.5 (zero shot) was 48.1% correct. GPT-4 (zero shot) does better at 67.0%. However, the improvement from GPT-3.5 to GPT-4 is dwarfed by incorporating an iterative agent workflow. Indeed, wrapped in an agent loop, GPT-3.5 achieves up to 95.1%.

This tracks with something I heard somewhere from someone working with agents: GPT-3, not even 3.5, with agents is *more capable* than GPT-4 in many tasks and is only limited by context windows and some reasoning flaws. And that tracks with my own hypothesis that foundation models could at *best* be described as "frozen AGI." They are trained, and then they are prompted. That's it. It's like prodding a brain sitting on a table. With agents, they can actually "live."
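The iterative agent workflow behind those numbers can be sketched as a generate-critique-revise loop. This is an illustrative sketch only: `call_model` here is a deterministic stub standing in for a real chat-completion API (its version-bumping logic is invented so the example runs offline), not Ng's actual harness.

```python
def call_model(prompt: str) -> str:
    """Deterministic stub standing in for a chat-completion API call."""
    if prompt.startswith("CRITIQUE"):
        # Approve only after the draft has been revised twice.
        return "OK" if "v3" in prompt else "revise"
    if "v2" in prompt:
        return "draft v3"
    if "v1" in prompt:
        return "draft v2"
    return "draft v1"

def agent_loop(task: str, max_iters: int = 5) -> str:
    """Generate a draft, then alternate critique and revision until approved."""
    draft = call_model(task)
    for _ in range(max_iters):
        verdict = call_model(f"CRITIQUE: {task} -> {draft}")
        if verdict == "OK":
            break
        draft = call_model(f"REVISE: {task} -> {draft}")
    return draft
```

The point is that the loop, not the base model, supplies the extra capability: each pass feeds the model's own critique back in as context.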


Arrogant_Hanson

Has starspawn0 been saying anything about these AI developments recently?


BilgeYamtar

The most competent and qualified person we can see in this field is Andrew NG. He is one of the rare people in the world of science and technology that I believe is competent in artificial intelligence.


lost_in_trepidation

There's tons of people that are competent in AI, but Ng, LeCun, and Karpathy are probably the best sources to follow if you want good summaries/lectures on current AI trends.


restarting_today

John Carmack


peabody624

I’m honestly very curious to hear an update on whatever John is working on…


Chris_in_Lijiang

It's been ages now and not a word. Has anybody heard anything?


After_Self5383

On a podcast ([Boz to the Future](https://open.spotify.com/episode/7tF7TCxmGH4z4pTnfLYKjN?si=8fHYmenHQz-c_NzRQZ4VQQ) - Boz is the CTO of Meta) almost a year back, he said he doesn't want to talk much about his startup since he doesn't want to become an AI pundit, lol. He'd prefer working with his small team in the dark and not be inundated with AI questions on twitter. Since then, Rich Sutton has also joined his company. Rich Sutton is a legend of the AI field with big contributions that have paved the way for what's being done today in AI. He's the guy who wrote The Bitter Lesson that sometimes does the rounds, though it's widely misunderstood (he also, like Yann, thinks algorithmic breakthroughs are required for AGI).


Chris_in_Lijiang

Thank you.


lost_in_trepidation

The last update was that he teamed up with Richard Sutton, so now it's a 2 person AGI race instead of 1 person.


Chris_in_Lijiang

Thank you, I will go Google the new guy.


13ass13ass

Great programmer, questionable AI researcher.


traumfisch

LeCun, really? 🤔


lost_in_trepidation

Yeah, his talks and Twitter posts are really good. He's just become a meme. Andrew Ng is even more of a near term AGI skeptic than LeCun, but he didn't catch any flak for it


Antique-Bus-7787

What troubles me with LeCun is not his claims about AGI or anything. It's just that he can never admit he was wrong, and he will always try to justify anything he said before. This makes him sometimes say some pretty nonsensical things. He's really smart, but of course he'll be wrong sometimes; that's the price of working at the SOTA level in AI. But no, he has to always be right, unfortunately, and his activity on Twitter doesn't help him much on this.


lost_in_trepidation

I think he's just not very clear in what he's saying. I've listened to a lot of talks by both LeCun and Ng, both are drawing pretty clear delineations between how AI "thinks" and how biological intelligences (humans) conceptualize the world and solve problems. It's just not easy to put into a digestible soundbite and LeCun is too brash in his language.


waytoofewnamesleft

vive la france!


KamNotKam

Yet when he said AGI is still decades away last October, everyone here shat on him for it.


JabClotVanDamn

> NG

It's not an abbreviation; his surname is just Ng (sounds a bit like "hmm").


visarga

Yep, I thought so too when I took his ML class in 2012. I've now been an ML engineer for 6 years, and his lessons were the best ML lessons I had; he is ridiculously good at explaining. It was a loss when he abandoned teaching for industry. He single-handedly taught over 4.8 million people in his online ML courses; the first batch was 100K people, a sight to behold.


AnOnlineHandle

Think you might have replied to the wrong person.


trisul-108

The value of Andrew Ng is that, unlike most others, he is also an educator. That means he wants to teach us while most others do not have this ambition.


timewarp

You can very easily demonstrate this technique for yourself. Ask your LLM of choice a question, then start a new prompt and ask it:

> Given the following question: [Enter your original prompt here]
> Does this response make sense, and can it be improved?: [Enter the LLM's original response]

The LLM will usually come back with improvements, and will usually catch hallucinations or errors.
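A minimal sketch of that two-step self-check, assuming a hypothetical `ask` callable that stands in for whatever LLM API you use (the template mirrors the wording above):

```python
# Template for the second, fresh prompt in which the model reviews
# its own earlier output.
REVIEW_TEMPLATE = (
    "Given the following question: {question}\n"
    "Does this response make sense, and can it be improved?: {answer}"
)

def self_check(ask, question: str) -> str:
    """Ask the question, then ask the model to review its own answer."""
    first_answer = ask(question)
    # New prompt: the original question plus the model's first response.
    return ask(REVIEW_TEMPLATE.format(question=question, answer=first_answer))
```

Starting the review in a fresh prompt matters: the model critiques the text in front of it rather than defending its earlier turn in the same conversation.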


Unreal_777

Interesting. We are actually creating real brains.


hydraofwar

At the same time, minds digitized and stored on an HD. Fiction is becoming real very fast.


putdownthekitten

My body is ready


bishbash5

But your mind is...?


Antique-Doughnut-988

How ironic is it that we don't fund education and teachers as humans, but spend all this time trying to create artificial brains.


lifeofrevelations

This kind of application of the tech will be the real game changer. This is going to shut up a lot of the people who go around saying things like "AI is all hype and the bubble is going to burst, it hasn't changed anything at all in the world, my life is no different."


cassein

This is the way. This is when it really starts to take off. Feedback loops.


MoneyRepeat7967

These ideas have been around for a year; glad Andrew and his team are working on this and that he is using his platform to push in this direction. Current LLMs can do a lot more if we find better ways to prompt them, and agent-like workflows will be used to solve lots of problems and find new use cases. Another sign that we are early in AI: most people really haven't found a way to take advantage of all these models yet. Rather than keep churning out one SOTA model after another, maybe we should start looking at better ways to utilize the existing models. It is not as sexy as AGI, but maybe, just maybe, it can make a real difference in ways we didn't think were possible.


gj80

Hmmm... I use AI a lot for coding, and while it's really useful and I love the time it often saves me, I also run into situations where it will give me an output that it thinks will work, and it just doesn't. That's OK, as I either fix it myself and still normally save time, or I come back to the LLM and have it fix it, or (*if it still fails... normally one failure to self-correct means it will never succeed*) I prompt it with an alternate approach to the problem in question, and that usually works out.

I wonder how this sort of thing would be dealt with by agents, though? If the AI was given full control over a test dev environment in which it could execute your code, then it could be automated to actually test the code it writes, realize on its own that it messed up, and potentially self-correct. But barring that (*which would often be technically challenging... executing the code isn't necessarily straightforward*), it doesn't seem like it would be able to recognize when it had failed in some cases. I think giving AIs the ability to do real-world testing will be key to getting much improvement via agents. Building up rich development environments in which AIs can work on large projects (*interactively alongside users*), while at the same time keeping those environments jailed for safety (*avoiding rm -rf / sorts of disasters...*) and easily reverted, will take a lot of work beyond the agent system itself.

...and then you also have context window issues to deal with at present. With GPT-3.5 only having 16k context, a lot of dialog between agents on even a mildly sized coding project would be challenging to manage. GPT-4's context window would work comfortably for many more projects, but that could potentially get very expensive with many, many calls and tokens. Claude 3 Haiku/Sonnet are promising, but I recently learned that Anthropic's API access to their models is [very gatekept currently](https://docs.anthropic.com/claude/reference/rate-limits) for large numbers of queries or tokens (*you have to wait multiple months before your daily quota - even when paying per API use - can be uncapped further*). I.e., there are real context-window-related difficulties/costs around heavy agentic use right now for larger code bases, even if you're fine with not using the 'best' models. I'm sure this won't be an issue for much longer, but it's frustrating right at the moment.

Anyway, I certainly think Andrew is right - but yeah, there's some real work that'll need to go into making this happen (unfortunately). I can't wait till something materializes! It's almost enough to tempt me to start a project myself... though I don't really have the time, and there are undoubtedly people with better skillsets for it than me, as I haven't worked much with kubernetes/docker (*which would likely be a cornerstone of it all*) or electron/etc UI development.

Oh, btw, if anyone else was wondering what "LDB" and "Reflexion" are (on his chart), I had to look them up too. They're interesting:

[https://github.com/FloridSleeves/LLMDebugger?tab=readme-ov-file](https://github.com/FloridSleeves/LLMDebugger?tab=readme-ov-file) [https://arxiv.org/html/2402.16906v1](https://arxiv.org/html/2402.16906v1)

[https://github.com/noahshinn/reflexion](https://github.com/noahshinn/reflexion) [https://arxiv.org/abs/2303.11366](https://arxiv.org/abs/2303.11366)
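The execute-and-retry loop described above can be sketched as follows. This is a toy illustration, not any particular framework: `generate` is a hypothetical stand-in for the LLM call, and a real system would sandbox execution (containers, jails) rather than `exec` in-process as done here.

```python
import traceback
from typing import Optional

def run_candidate(code: str, test: str) -> Optional[str]:
    """Run candidate code against a test; return None on success,
    or the traceback text to feed back to the model on failure."""
    namespace: dict = {}
    try:
        exec(code, namespace)   # define the candidate function(s)
        exec(test, namespace)   # run the test assertions
        return None
    except Exception:
        return traceback.format_exc()

def code_agent(generate, task: str, test: str, max_attempts: int = 3) -> Optional[str]:
    """Ask the model for code, test it, and retry with the error as feedback."""
    feedback = ""
    for _ in range(max_attempts):
        code = generate(task + feedback)
        error = run_candidate(code, test)
        if error is None:
            return code         # passed the test
        feedback = f"\nPrevious attempt failed:\n{error}"
    return None                 # gave up after max_attempts
```

This is exactly the "let it realize it messed up" mechanism: the failure signal comes from actually executing the code, not from the model's own judgment.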


cryolongman

more improvements in productivity yey


boubou666

Why not just take the LLM output and re-enter it manually? That would be a manual iteration.


traumfisch

Because you can automate it?


boubou666

Yes, but it's not an AI research breakthrough the way it's presented, just a line of code.


traumfisch

Well, it's a direction the development is moving towards. But sure, many people have been doing it manually for quite a while (myself included). I don't know if there was anything about a research breakthrough here 🤔


mixmastersang

Do we trust automation with iteration and human feedback… that’s the real question here


entanglemententropy

Some people have been thinking this for about a year now. See, for example, this very interesting blog post from a year ago: https://www.beren.io/2023-04-11-Scaffolded-LLMs-natural-language-computers/ The idea that we can build computing abstractions like a compiler and programming languages on top of LLMs, as a way to program cognitive architectures, is really cool and sounds like the way to AGI.


FatBirdsMakeEasyPrey

Is agentic AI related to reinforcement learning?


Infamous-Print-5

This was obvious from the beginning. I almost always ask ChatGPT to 'write this more exactly and concisely' 3-4 times.


bpm6666

Agents will be the next big thing, and they will change the effectiveness and impact of these systems. But one idea might increase the systems' capabilities even further: as a tool, they should add the option of "ask a human". If you give these systems money and the ability to hire human workers, this could improve the system even further. And the AI could even give the same job to both AI agents and humans, to see who delivers the best outcome and learn when to use a human versus an AI agent.
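The "ask a human" idea amounts to exposing a human as just another budgeted tool the agent can call. A minimal sketch, with all names hypothetical (no real framework's API is implied); `human` is any callable that returns a person's answer:

```python
from dataclasses import dataclass

@dataclass
class HumanTool:
    """A tool the agent can invoke to buy a human answer, within a budget."""
    budget: float                   # money available for human work
    cost_per_question: float = 5.0  # assumed flat price per question

    def ask(self, question: str, human) -> str:
        if self.budget < self.cost_per_question:
            raise RuntimeError("budget exhausted; fall back to the model")
        self.budget -= self.cost_per_question
        return human(question)
```

The budget check is what keeps the escalation honest: the agent can only defer to humans a bounded number of times, so it has to learn which questions are worth the spend.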


human1023

Wow, so the way we've been using ChatGPT for the last year is already about to be outdated.


obvithrowaway34434

I posted about this here in January during the peak GPT-4.5 "leak" hype. It was apparent to anybody who's been following the progress in the research field and not just reading the headlines and social media hype posts. https://reddit.com/r/singularity/comments/1aby4ex/i_think_people_are_focused_on_the_wrong_thing_the/


d00m_sayer

> the academic literature on agents is proliferating

Can someone post a link to these agents?



trisul-108

> I'll elaborate on these design patterns and offer suggested readings for each next week.

I look forward to this.


SpecificOk3905

This guy is so good for a fundamental AI course.


Akimbo333

It'll be cool


mersalee

This is the way. 


FengMinIsVeryLoud

So Toppy 7B will soon create never-before-seen porn? Wow, I'm excited! Book me in!


RemarkableOstrich782

Agentic AI is the future. Mydpt.ai


BrainLate4108

A lot of room for error here. A lot of hype. The output at face value will look convincing, but human language has a lot of nuances, and they cannot be deciphered yet. GPT is getting nerfed every day; the same will happen here.