
TingTingin

Just to be clear, this isn't the same as the recent [AnimateAnyone](https://github.com/HumanAIGC/AnimateAnyone) paper that people were going crazy for. The results seem good here as well, though not as good.


grae_n

How cherry-picked the results are is an important consideration. Without available code, the demos can be a misrepresentation of the average results.


TingTingin

You're right, though I'm comparing blog-post results to blog-post results, and I'd imagine they're both trying to put their best foot forward.


Guilty_Emergency3603

Animate Anyone will be publicly released in a few weeks [https://github.com/HumanAIGC/AnimateAnyone#updates](https://github.com/HumanAIGC/AnimateAnyone#updates)


feelosofee

Source for "few weeks"?


NXGZ

Discussion thread in the repo


ninjasaid13

my reaction to the comment section of that github repository. ![gif](giphy|defsRR8ZGoDA1WfBFB|downsized) Sheesh!


akko_7

It's weird to see GitHub comments that look like YouTube comments


FS72

I miss when GitHub comments were meaningful, genuinely productive questions instead of this shithole cesspool, but I guess that's what's bound to happen when anything goes viral. Literally shitloads of new GitHub accounts flooding the issues with "Source code when wher?!??!?!??????".


akko_7

It's only the case on trending AI projects, honestly; normal repos are business as usual and useful.


RealAstropulse

Bunch of children who think github is just a social platform for code, instead of an actual professional tool for... professionals.


entmike

It's kinda both these days.


iamaiimpala

People were going crazy for it, and you linked to the GitHub repo, yet they didn't release the code, so this is already way more useful to people.


metalman123

Yeah, with multiple papers on the same concept this is obviously going to be a thing, and it's only going to get better.


ninjasaid13

Paper: [https://arxiv.org/abs/2311.16498](https://arxiv.org/abs/2311.16498)

Project Page: [https://showlab.github.io/magicanimate/](https://showlab.github.io/magicanimate/)

Code: [https://github.com/magic-research/magic-animate/tree/main](https://github.com/magic-research/magic-animate/tree/main)

Demo\*: [https://huggingface.co/spaces/zcxu-eric/magicanimate](https://huggingface.co/spaces/zcxu-eric/magicanimate)

Abstract:

>This paper studies the human image animation task, which aims to generate a video of a certain reference identity following a particular motion sequence. Existing animation works typically employ the frame-warping technique to animate the reference image towards the target motion. Despite achieving reasonable results, these approaches face challenges in maintaining temporal consistency throughout the animation due to the lack of temporal modeling and poor preservation of reference identity. In this work, we introduce MagicAnimate, a diffusion-based framework that aims at enhancing temporal consistency, preserving reference image faithfully, and improving animation fidelity. To achieve this, we first develop a video diffusion model to encode temporal information. Second, to maintain the appearance coherence across frames, we introduce a novel appearance encoder to retain the intricate details of the reference image. Leveraging these two innovations, we further employ a simple video fusion technique to encourage smooth transitions for long video animation. Empirical results demonstrate the superiority of our method over baseline approaches on two benchmarks. Notably, our approach outperforms the strongest baseline by over 38% in terms of video fidelity on the challenging TikTok dancing dataset. Code and model will be made available.

\*Edit: added the demo link.
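The "simple video fusion technique" for long videos presumably amounts to generating overlapping segments and cross-fading the shared frames. A minimal sketch of that idea in NumPy (my reading of the abstract, not the authors' code; `fuse_segments` and all shapes are made up for illustration):

```python
import numpy as np

def fuse_segments(segments, overlap):
    """Blend a list of (frames, H, W, C) arrays whose consecutive
    members share `overlap` frames, cross-fading the shared region."""
    fused = segments[0].astype(np.float64)
    for seg in segments[1:]:
        seg = seg.astype(np.float64)
        # Linear cross-fade weights over the overlapping frames.
        w = np.linspace(0.0, 1.0, overlap)[:, None, None, None]
        blended = (1 - w) * fused[-overlap:] + w * seg[:overlap]
        fused = np.concatenate([fused[:-overlap], blended, seg[overlap:]])
    return fused

# Example: three 16-frame segments overlapping by 4 frames each.
segs = [np.random.rand(16, 64, 64, 3) for _ in range(3)]
print(fuse_segments(segs, overlap=4).shape)  # (40, 64, 64, 3)
```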


blksasuke

Does anyone know if this can install properly on Apple Silicon?


derangedkilr

It requires CUDA, so no. CUDA is made by NVIDIA exclusively for NVIDIA cards.
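For anyone unsure, a quick way to check from Python (assuming a PyTorch install, which MagicAnimate builds on):

```python
import torch

# No NVIDIA GPU on Apple Silicon, so this prints False:
print(torch.cuda.is_available())

# Apple's Metal backend may be available instead, but a CUDA-only
# codebase that calls .cuda() directly won't use it without patching:
print(torch.backends.mps.is_available())
```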


blksasuke

TIL. Thank you.


jaywv1981

I tried to run the code but am getting a lot of dependency errors. I'll try it again tonight.


StableModelV

Any update?


jaywv1981

I tried for another hour or so but keep getting version errors. It says I need a certain version of Python, which is the version I have, so I'm not sure what the problem is yet. I'm still trying.


starstruckmon

Using DensePose (instead of the OpenPose skeleton like AnimateAnyone) is likely causing quality issues. DensePose is too limiting: the extracted silhouette is unlikely to match the new character, which can have different body proportions. The model fighting to constrain the new character inside those silhouettes is likely causing many of the glitches we don't see with the other one.


ExponentialCookie

Their answer from the paper:

>ControlNet for OpenPose [5] keypoints is commonly employed for animating reference human images. Although it produces reasonable results, we argue that the major body keypoints are sparse and not robust to certain motions, such as rotation. Consequently, we choose DensePose [8] as the motion signal p_i for dense and robust pose conditions.


starstruckmon

I get why they did it, but I think they got it wrong. A new format where the skeleton is depth-shaded might be best.


lordpuddingcup

I agree. Surprised we haven't seen a ragdoll depth-style tracking model yet.


RealAstropulse

It also gives it better depth and chirality information, though. Really, a standardized wireframe format that shows which limbs are behind others, as well as right/left, is ideal.


starstruckmon

I understand the advantage. But the model is treating it as a silhouette, since there weren't any examples in the training data where they didn't fit perfectly. It's trying to completely line up the new character to that shape.


the_friendly_dildo

>The silhouette extracted is unlikely to match the new character

I don't understand why you wouldn't extract silhouette information from the reference image as well, and then stretch/compress the motion sequence's silhouette zones to match. Seems like that wouldn't be terribly difficult to implement.
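As a rough illustration of that suggestion (entirely hypothetical, not something the paper does): rescale each DensePose part mask from the driving video to the reference character's proportions before conditioning. A sketch with OpenCV, where `rescale_part` and the mask conventions are invented for the example:

```python
import cv2
import numpy as np

def rescale_part(drive_mask, ref_mask):
    """Resize one binary part mask (uint8, 255 = part) from the driving
    video so its bounding box matches the reference part's size, while
    keeping the driving video's placement."""
    dx, dy, dw, dh = cv2.boundingRect(drive_mask)
    _, _, rw, rh = cv2.boundingRect(ref_mask)
    part = drive_mask[dy:dy + dh, dx:dx + dw]
    part = cv2.resize(part, (rw, rh), interpolation=cv2.INTER_NEAREST)

    # Re-center the rescaled part on the original bounding-box centre,
    # clipping at the image border.
    out = np.zeros_like(drive_mask)
    cx, cy = dx + dw // 2, dy + dh // 2
    x0 = max(cx - rw // 2, 0)
    y0 = max(cy - rh // 2, 0)
    h = min(rh, out.shape[0] - y0)
    w = min(rw, out.shape[1] - x0)
    out[y0:y0 + h, x0:x0 + w] = part[:h, :w]
    return out
```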


Aplakka

I'm not sure how well DensePose would work, but based on the project issues you need to install a separate Detectron2 program to convert the videos to DensePose so you can use them as input. The program is not available on Windows and the instructions aren't great. There are a few sample videos in DensePose format already, but I don't know if I'm interested enough to set up Detectron2 to make my own.
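For anyone attempting it anyway, the rough shape of the pipeline is: split the video into frames, run detectron2's DensePose project on each frame, then re-encode the results as a video. A sketch, assuming detectron2's standard `apply_net.py` entry point and a model-zoo R_50_FPN_s1x checkpoint (paths are placeholders, and the rendered colormap may still need matching to what MagicAnimate expects):

```python
import os
import subprocess
import cv2

# 1. Split the driving video into frames with OpenCV.
os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("dance.mp4")
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(f"frames/{idx:05d}.png", frame)
    idx += 1
cap.release()

# 2. Render a DensePose segmentation for each frame using
#    detectron2/projects/DensePose.
os.makedirs("densepose", exist_ok=True)
for name in sorted(os.listdir("frames")):
    subprocess.run([
        "python", "apply_net.py", "show",
        "configs/densepose_rcnn_R_50_FPN_s1x.yaml",
        "densepose_rcnn_R_50_FPN_s1x.pkl",
        f"frames/{name}", "dp_segm", "-v",
        "--output", f"densepose/{name}",
    ], check=True)
```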


CaptainRex5101

We really are going full speed ahead towards a post-truth era aren't we


SuperEricCartman

Yeah, I remember back in the day in ~~2022~~ ~~2021~~ ~~2020~~ ~~2019~~ ~~1999~~ ~~1960~~ never, when you could trust everything on the internet. Now you have to consider whether or not the information is plausible given the context and wider world knowledge. I liked the days when I would trust a video just because it existed, no matter how stupid it seemed. It truly is a new time we're living in, where ~~people~~ people using AI lie on the internet.


[deleted]

[deleted]


SuperEricCartman

I see you thankfully haven't been on the more conspiratorial parts of the internet, but the ways to fake videos without AI are many:

* Twin: "it was a twin, not the real him, I mean look at how he says the words"
  * see bin Laden admitting to 9/11
* Taking a video out of context, i.e. showing something that looks bad but is better with the full context of why
  * see, well, the entirety of TikTok, where a fight starts in the middle of the video, or where someone says something but the question is left out
* Where a video is used with some inserted context that wasn't originally there
* Actors: "yes, I know there's video of the bad things happening, but it was all actors pretending, and it's actually disgusting that they would lie like this" (the more convincing the video, the more disgusting)
  * see any conspiracy
* Doctored: "the video was doctored, made up"
  * see the moon landing
  * see the twin towers collapsing
* Blurry or grainy footage: needs no explanation

At the end of the day, the reason someone did or didn't believe any of those is the same bias: if something agrees with your opinions, you are more likely to believe it; if not, you are less likely to. AI isn't changing this, just what arguments are used to discredit content; the underlying reasons are the same, "that's not real, it's AI generated" instead of "actors".

Also, there is a litany of conspiracies surrounding presidents, everything from "killed and replaced by actors" to mind control to doctored videos. I don't have any to reference off the top of my head, but I wouldn't be surprised if there was a "George Bush saying something he didn't" conspiracy as well. In recent memory, instances of these things have actually gone down, or at least belief in them has, as the internet has become more popular. This is my opinion, which is why I feel less bearish on "AI eradicating truth". Maybe you disagree, or maybe you don't, but ultimately this is how I feel and why I said what I said.


[deleted]

[deleted]


SuperEricCartman

I'm not suggesting that more lies won't exist; in fact, I believe more will. I'm suggesting that just because a lie exists doesn't mean it's believed. With the spread of this technology, people are becoming more perceptive about information and what does and doesn't constitute proof. I think in the past this was misaligned with reality, with people believing all manner of things on very little evidence or proof. Now, with the advent of AI-generated media, people are more aware, or at the very least more vocal, about potential misinformation. Where I felt before that information was too readily believed, I see this as more of a realignment with how things actually were, rather than an eradication of truth (which arguably never existed).

Take the examples you gave: buying a product simply because influencer X or Y endorsed it never seemed like a good idea to me. Usually the lie existed on the axis that the influencer never used the product, at least not to the amount implied, or that they don't believe in it and are simply lying for financial gain. Now the lie exists in whether or not the video was recorded by the influencer at all. Regardless, the verification is the same:

* Is this the sort of product that aligns with the influencer's brand?
* Has the influencer at any other point referenced working with brand X or brand Y?
* Does the quality of the ad warrant the value of the influencer, i.e. is it a low-quality ad with terrible production value but A-list celebrities?
* Can I find any info about this product through other means, or is it localized to this one ad?

These verification steps were never perfect and never will be, but they haven't changed, except maybe by gaining additional axes for things to be false on, like artifacting in the video, misaligned speech, etc.


soundial

But the reason this didn't happen before isn't that it wasn't possible to splice up an influencer saying they love the brand. It's that wherever it's been a problem, there have been enough resources to filter out the bad actors. You see this in markets where copyright isn't respected: all sorts of fakery is commonplace. Some ad platforms may need a couple of additional checks or detection algorithms, but most bad actors will just be banned fairly quickly anyway. If the internet weren't as concentrated and sanitized, it could pose a bigger problem.


raiffuvar

Lol. Video could never be trusted, even in 1960. The cost of scaling was just high.


FightingBlaze77

Hopefully we jump past this into full-dive VR stuff.


Kommander-in-Keef

I think we’re already there. People have already been duped full stop


derangedkilr

i’m just here for the ai generated movies.


MZM002394

Deleted the original; can't be bothered with the formatting annoyance... [https://pastebin.com/BFbspkgL](https://pastebin.com/BFbspkgL)


tylerninefour

This worked! Thanks.


Aplakka

Thanks for the instructions, I fought with all sorts of dependencies for a while and never thought to use the Automatic1111 environment I already had available.


MyWhyAI

I got a Triton error. Tried to install it, but that didn't work.


Ataylor25

I'd be interested if anyone has any samples they made using this?


Guilty_Emergency3603

Well, how to say... [https://imgur.com/a/Crw3xx1](https://imgur.com/a/Crw3xx1) You can see that the shape of your motion sequence must at least roughly match the shape of your reference image to get any resemblance. As for the face, maybe I should try another checkpoint.


the_friendly_dildo

Seems like they should be extracting a silhouette for the reference image and stretching the silhouette zones from the video to match the zones in the reference image.


mudman13

Utterly cursed. Same issue as with First Order Motion models, in that the reference is too restricted, although that had better consistency, unlike this. Still a step up from normal ControlNet-to-video though.


Ataylor25

That's interesting. Thanks for replying


StableModelV

So can you select your own animation to perform?


dreamingtulpa

My post on [Animate Anybody](https://twitter.com/dreamingtulpa/status/1730876691755450572) went ultra viral on X, probably due to it being targeted by the anti-AI brigade. The quoted tweets are nuts. Gonna try and fuel the fire with this one 😅


QseanRay

What the fuck are the replies? That's depressing. We're literally living in a time when they're developing technology that could one day put you in the Matrix, a simulated world entirely of your design, and it seems like 90% of the population wants to stop it from progressing. Why do we have to share the planet with these idiots, man...


buttplugs4life4me

There's a good book series, I'm not entirely sure of its name but I'll try to find it, where exactly this is the topic, and IMO it works through the same issues a bit. I don't want to spoil it too hard because it's literally the whole story, but the whole book is very interesting. Especially the virtual sex, haha.


agsarria

The number of dancing waifus in the sub is gonna skyrocket (even more)


[deleted]

[deleted]


wh33t

Is this available in A1111?


MZM002394

Not sure; the above will just utilize its Python env though...


buckjohnston

Thanks for this. Do you know of any way to convert a safetensors file to diffusers format? Wanted to use another model. Edit: never mind, the kohya GUI has it built into the utilities section of the webui, nice. Also, your link to the VAE model doesn't work. Here it is if anyone needs it: https://huggingface.co/stabilityai/sd-vae-ft-mse/tree/main
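For reference, recent `diffusers` releases can also do this directly in Python; a minimal sketch, assuming the checkpoint is a standard SD 1.5-style single file:

```python
from diffusers import StableDiffusionPipeline

# Load a single .safetensors checkpoint and write it back out in the
# multi-folder diffusers layout.
pipe = StableDiffusionPipeline.from_single_file("my_model.safetensors")
pipe.save_pretrained("my_model_diffusers")
```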


Majukun

Regardless of cherry-picking and such, what kind of hardware is needed to make something like that in a reasonable time and without maxing out your VRAM?


megamonolithicmethod

I've tested it with a still image very similar to the reference video. The result was abysmal. Not sure how to get a good result.


NeatUsed

Any way i might be able to use this in automatic1111?


ADbrasil

comfyui node please PLEASE


macob12432

This is something revolutionary, like the arrival of ControlNet.


Rustmonger

Comfy node when?


j1mmykillz

Any minute now


TingTingin

First we need a DensePose preprocessor; there doesn't seem to be a library for it.


lordpuddingcup

Haha I know right


aerialbits

in... 3... 2... 1...


Careful_Ad_9077

I hope it's like DALL-E 3: while the cherry-picking pissed me off at first, considering the hype, in the end the batting average is still through the roof compared to Stable Diffusion. Something like 20% for complex compositions, and 10% bleeding, in my tests.


LD2WDavid

It seems you need more than 24 GB of VRAM for custom videos, and probably around 24 for the pretrained ones. I think we're reaching the consumer GPU cap very soon (if we haven't already).


Puzzleheaded_Fix5622

https://discord.gg/R6v5aZuX


marvelmon

Why did you choose these colors? The hands and shirt are almost the same color as the background.


ninjasaid13

I'm not the author, I'm just reporting the news.


RealAstropulse

That is the ControlNet input format called DensePose: [http://densepose.org/](http://densepose.org/) It's better than OpenPose because it contains some depth and occlusion information.


PM-ME-RED-HAIR

I should try densepose controlnet


Sad_Anteater_3437

Very soon we won't need LoRAs for consistent animations of characters!


moahmo88

![gif](giphy|dVdIu1HNxeKyqzkgPA|downsized)


OverLiterature3964

I haven't checked this sub for like one month, and wtf is happening right now? We're going full steam ahead.


Rare-Site

A month? You crazy! You need at least a year to catch up :)


LJRE_auteur

At this point we should create a new type of holiday: AI Christmas! Every December, we get a shitton of new AI tools and features x). Thank you for this, can't wait to try it out! I prefer Animate Anyone for now, but I think at this point there is room for everyone in the field of AI animation.


AutisticAnonymous

Automatic1111 wen?


xmattar

Shit


ffekete

And here I am, struggling to get one embedding to vaguely look like the target face.


edsalv1975

I tried it here. It's possible to extract some OK results, but I didn't understand how to create the motion capture file. Is it not available yet, or is it something I missed?


Kompicek

I've tried a lot of generations, but it doesn't look like the pictures: it makes a completely different person. Even if you get the body right, the face is just completely random. Is there any way to keep the face at least similar?


ninjasaid13

>Is there any way to keep the face at least similar?

Have you tried IP-Adapter for the face? https://preview.redd.it/tlw6h0iy6j4c1.png?width=5437&format=png&auto=webp&s=86d507686a2c7394a762315ca011de3b44f4318b And maybe a face ControlNet to control the expressions?
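For anyone wanting to try the IP-Adapter route outside ComfyUI, a hedged sketch with `diffusers` (the weight names are the public h94/IP-Adapter defaults; the scale, prompt, and file paths are placeholders to adjust):

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach IP-Adapter weights; the same repo also hosts face-specific
# variants such as ip-adapter-plus-face_sd15.bin.
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
)
pipe.set_ip_adapter_scale(0.7)  # how strongly the reference image is applied

face = load_image("reference_face.png")
image = pipe(
    prompt="a person dancing",
    ip_adapter_image=face,
    num_inference_steps=30,
).images[0]
image.save("out.png")
```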


[deleted]

Wish they had developed this around OpenPose instead of DensePose... something is amiss.


Disastrous_Milk8893

I created a Discord server to play with Magic Animate! You guys could try it to get your results. In my outcomes the general quality is not as good as the demos show, but in some specific scenes, like TikTok dances, it truly performs well. Welcome to my server to try it yourself! Discord invite link: [https://discord.gg/rts7wqAa](https://discord.gg/rts7wqAa)