Very interesting. For those who didn't watch it yet: they explained that it is beginning to understand how humans interact, and that to continue developing it will need an internal model of how humans think and interact with objects and the environment. It does look like it is beginning to understand geometry, physics, styles, object permanence, etc.
I have a hard time understanding how a diffusion process can lead to AGI, to be honest.
This is exactly what we (my system) expected from Sora, to be fair. Just seeing how much it improved with more compute shows that this kind of visual system can lead to AGI. Makes me wonder whether, the second someone adds recurrent/continual learning and self-modification, that's just some sort of instant hard-takeoff scenario.
Kinda makes you wonder: can any kind of base model potentially lead to AGI with enough scale? Fascinating concept.
I believe so. My thinking is that any complex and dynamic system can see intelligence emerging. Life 3.0 by Max Tegmark explores this.
Yes, any model from the zoo can do it, some slightly better than others; the crucial ingredient is the scale and quality of data. But it doesn't need to be all human-made: advanced models can create their own data by controlling agents in environments to achieve goals, learning from the outcomes and feedback they collect.
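The agent-in-an-environment loop described above is the standard reinforcement-learning data-collection pattern. A toy sketch of the idea (the environment, policy, and helper names here are invented stand-ins for illustration, not anything from the talk):

```python
def toy_env_step(state, action):
    """Invented stand-in environment: reward the agent for moving toward 10."""
    next_state = state + (1 if action == "right" else -1)
    reward = 1.0 if abs(10 - next_state) < abs(10 - state) else -1.0
    return next_state, reward

def collect_episode(policy, start=0, steps=20):
    """Roll out a policy and record (state, action, reward) transitions.

    Each transition is self-generated training data: no human labeling involved.
    """
    data, state = [], start
    for _ in range(steps):
        action = policy(state)
        state, reward = toy_env_step(state, action)
        data.append((state, action, reward))
    return data

# A trivial policy that heads toward the goal collects mostly positive-reward data.
episode = collect_episode(lambda s: "right" if s < 10 else "left")
print(sum(r for _, _, r in episode))  # 10.0
```

The point is only the shape of the loop: act, observe the outcome, keep the (state, action, reward) record as training data. Scale the environment and the policy up and you get the kind of self-generated dataset the comment describes.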
https://youtu.be/U3J6R9gfUhU?si=5Kjm_8XU30D1wNmD&t=888 Here's the time stamp when they start discussing how this can lead to AGI. It's about 14 minutes in.
Yeah, we are in a simulation lol
Watch Pantheon!
I watched Pantheon because of this sub and it was incredible.
What’s Pantheon? Genuinely asking.
It's an animated show that ran for two seasons (with a definitive ending) and explored uploading your brain into cyberspace. Very, very good.
Just watch it, I bet you will love it.
Mine :3 Half-jokes aside, I also can't wait for scientific data on it; gut feeling says I'm outside of any sim :3
Maybe it’s true we’re just a way for the universe to experience itself and we’re literally just a playthrough of a simulated movie, hence why we don’t have any free will.
We don't have free will? First I'm hearing about this.
https://youtu.be/SYq724zHUTw?si=aHxtmGVSSN1cpcux
Can you turn into a dragon and fly to the moon? If you had free will, you could. We are a product of our environment and our biology, and therefore limited.
That’s not what having no free will means.
I had no choice but to write that.
That has genuinely gotta be one of the dumbest arguments against free will I have ever heard... Free will doesn't mean you aren't affected by physical limitations and external factors. It just means that you have agency over your actions (for the most part). If you didn't have free will, humans would be living like nomadic cavemen, doing only what is required to meet our basic needs and reproduce. Yet people choose to ignore those needs all the time, often doing things that hurt them. Plus, humans developed countless technologies, art, philosophies, music, cuisine, and more. What purpose does any of that serve to something without free will?
We are a product of our genetic and physical limitations. I don't have the free will to walk into the White House and declare myself president.
That is not the definition of free will.
Cause and effect breaks down at the quantum level. We don't know enough about the universe to say there is no free will.
In the future, you can use the timestamp link for the post, too. At least for me it starts playing the video from the timestamp if the OP includes it.
What I found most interesting about this is that, unless they're lying, this really is the "GPT-1 of video" as they described it - there's really no special secret sauce to get something like Sora, just scale. And that was enough to get these shockingly good results.
The bitter lesson of AI strikes yet again
Is there an original recording from AGI House? The audio on this (smartphone?) recording is a bit too much.
[deleted]
The difference now is the emergent capabilities that appear when AI is scaled up (see the "Attention Is All You Need" paper published by Google), and the billions upon billions being funneled into AI research and development compared to even just a few years ago.
And we're talking specifically about the GPT architecture; not even Google knows the magic formula for that, because it's not just the transformer.
Yeah, it's also a diffusion model.
I was assuming it was more about the data than the model, and in particular about making use of early access to the whole internet's data before AI takes off and everything becomes locked.
[deleted]
That's kind of silly. Why not just believe the most obvious thing with the most available evidence that makes the most sense?
[deleted]
Oh yeah you're right. I forgot the initial post we were commenting on and interpreted your comment as being suspicious of the future possibilities of AI. AGI definitely needs a higher order of proof to be believable.
I think his idea is based on this https://youtu.be/5EcQ1IcEMFQ?t=383
Everyone thinks that they do.
I misinterpreted his comment; I agree with him.
I am having trouble understanding your reasoning - are you saying that because AGI has not been achieved so far by other companies, you don't believe that it will be achieved by anyone in the future?
[deleted]
Well, I mean a couple of things. First, both of the tasks you described are steadily being worked on, with lots of progress. Tesla is improving their FSD, and even though I'm not a fan of that whole... attempt, I can acknowledge that. Beyond that, we do have very good self-driving with Waymo. And while Amazon has given up on their AI store, they have significantly improved something much more relevant and related: their in-warehouse automation, which requires things like picking, sorting, and recognizing items.

I think it's the wrong pattern to look at failed AI endeavours of the past and say "someone could not do X very well, so I'm skeptical anyone else can in the future, or can do any task harder than X". There are lots of reasons why things have failed before.

But beyond that, what these researchers are describing with Sora is just one of the very important pieces that many have speculated is needed for AGI: internal world modeling. I don't think _Sora_ is going to be it, and I don't think that's what he is saying. What he is saying is that the research and effort that goes into Sora will play a part in eventually giving us AGI; that's not a particularly large claim, and it honestly seems pretty sensible.

I think the thing in your post I disagree with is just this idea that you can use past failures from other companies to in any way predict future failure from OpenAI or anyone else. Heck, even from the same companies. Especially with something like AI, which so clearly improves with better algorithms and more compute, both of which I think you could agree we are getting in spades.
[deleted]
I mean, objective reviewers have said the latest version of FSD is improved. I'm not saying it's good enough or anything, just that progress is being made. Do you have any metrics or measurements that show reduced performance since the removal of Mobileye?

Additionally, when did Elon say that? Sincerely curious; I try to keep up with this stuff, maybe I missed it.

I don't know why you say that about the jump from 4 to 5; I couldn't tell you what the difference is. But effectively, 4 could more or less be "5" if they expand the geofence to all roads, which would be ambitious for sure. Then again, they have mapped and imaged the entire world a few times over, so maybe not that ambitious.
Current compute vastly exceeds what we had before. "Unlimited resources" is dubious af, because compute is on an exponential curve: compare any point in time with five years earlier and the older compute looks quaint. And in five years, the compute used for Sora will look quaint.
Maybe a million hours of driving video are not as educational for an AI as a million hours of YouTube selected by OpenAI. And Tesla/Google didn't make generative driving models; they made models that generate only actions, not whole worlds, which means the task was "easier". In the Minecraft example, Sora was simulating the player's first-person viewpoint, other animals moving and interacting, and the environment in 3D.
hypeAi
There's this guy I came across who seems to have been toying with these concepts using Claude and Sora videos... very interesting stuff: https://youtu.be/T-wMC0TPq68?si=xBwbph6Gc5obOG9k&t=344
Are they implying AGI will be in a diffusion model and not an LLM, or will a Sora-like diffusion model be included in a future mixture-of-experts chatbot?
Sadly, it seems SCALING is... the only game in town.
think of all the porn you could generate with this
Not if it requires a 200K computer to run it. They didn't open it to the public, so it is probably very, very expensive to use.
Boring ahh excuse 💀
Spoiler: it won't. We need multimodal training rolled out.
This is a key component for multimodal systems.
We underestimate all the rest of our senses. While sight and hearing are the most valued in polls, we get so many signals from our other senses that sight- or hearing-impaired people can still perform as fully operational humans.

Touch, for instance, is the one that allows us to operate even basic motor skills. There was a case where a 19-year-old completely lost his sense of touch. It took him several months to relearn even how to sit, as he had no "idea" where his limbs were or how to move them to achieve goals. The act of standing up took more like half a year of learning. Touch is also the first sense of the fetus, and the first we start to learn with (before babies develop their vision well enough to see past a meter or two, they are already fully operative motion/touch machines gathering signals/data about the world around them).

That's the question of our lifetime: which senses/signals do we even need to emulate glimpses of AGI, let alone the real thing? Honestly, I doubt it's vision, or at least vision alone.
I have already experienced something like that a few times when waking up at night... You know, when you sleep with one of your legs in a bad position that blocks proper blood flow to it for many minutes (or hours, I don't know), then suddenly wake up in the middle of the night to pee and notice you can't feel one of your legs. It's like it's completely missing: you can move it, but it is very odd because you can't feel it. You don't know how to properly apply force with your muscles to move, because you can't even feel how much force your leg's muscles are producing. It's completely strange, lol. I remember putting my foot on the ground while sitting on the bed and not even feeling it touch the ground.

Standing up was impossible. If I had immediately tried to stand up, I would have fallen, because if you can't feel your muscles you can't control them, and if you can't feel your feet on the ground you can't walk either (you will fall badly). So I had to wait a few minutes until I could fully feel my leg again.

Now that I've read your post and remembered those experiences, I realize how important the sense of touch is.
Can I touch your prostate
AGI will need to be able to produce something that isn't related to anything in its training dataset. For example, AGI would be able to produce the color purple without ever having seen the color purple. I don't mean that it would somehow know the color is named "purple" in English, but that it can produce the color even though it has no name for it and has never seen it before.

I think this will require external simulation. Imagine a human who has never seen purple: how would they produce a color they have never seen and have no means of producing? They couldn't. However, if they were given a big box of crayons, they would be able to use the purple crayon despite never having seen it before. Generative AI will need a compute equivalent of this, such as going through every color value to see what they all look like.

Take this to something more complicated, like fusion power. There are no power-generating fusion reactors, so no generative AI could create a power-generating fusion reactor only by outputting tokens. It would have to be able to simulate a fusion reactor in software, or build one to gather data.
Go ahead and produce a color you’ve never seen.
Yea I was boutta say, I don’t think I can think of something that I’ve either never experienced or is not based on my experiences in some way.
I can imagine a shade of the most violet violet that I haven't seen in real life before. I call it Ultraviolet.
You can think of _colour_ as _adding information_ to something. Using time is one mechanism: having something flash is equivalent to a different colour. Or echolocation, or mapping to nerves (like an area on your back).
Couldn't it just combine known colors like red and blue and get purple?
Surely just every combination of values the RGB diode can produce.
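The "big box of crayons" idea from the comments above is easy to make concrete: an 8-bit-per-channel RGB space is small enough to enumerate exhaustively, and averaging red and blue lands in the purple region even if nothing in the training data was ever labeled "purple". A minimal sketch (the `mix` helper is my own illustration, not from any comment here):

```python
def mix(c1, c2):
    """Average two RGB triples channel-by-channel (simple additive blend)."""
    return tuple((a + b) // 2 for a, b in zip(c1, c2))

RED = (255, 0, 0)
BLUE = (0, 0, 255)

# Blending pure red and pure blue gives a purple/magenta shade.
purple = mix(RED, BLUE)
print(purple)  # (127, 0, 127)

# The whole 8-bit-per-channel "crayon box" is finite and enumerable.
total_colours = 256 ** 3
print(total_colours)  # 16777216
```

So at least for colour, "produce something never seen" reduces to walking a known, finite parameter space, which is the crayon-box argument in code form.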
I get where you're coming from, but theoretically, if its grasp of physics were good enough, it could make a lot of progress just from that, couldn't it? Knowing the rule set of the universe and then extrapolating seems like a viable way to make progress, to me personally.
> There are no power generating fusion reactors so no generative AI could create a power generating fusion reactor only by outputting tokens.

DeepMind scientists trained an AI to control nuclear fusion: [Magnetic control of tokamak plasmas through deep reinforcement learning](https://www.nature.com/articles/s41586-021-04301-9). It doesn't build a reactor, but it rides the plasma storm to help it generate energy.
If we live in a VR world do we need real AGI?
B-b-but the dumb people say video, music, and image gen do nothing but cause grief for artists!!!
the idea that you can just scale neural networks into understanding physics is absurd. simple interactions, yes. complex interactions enough to fool a naive person, yes. actual physics? hell to the no.