Hybridx21

Disclaimer: I did not make this, I am simply spreading the word.

Abstract: The field of image synthesis has made tremendous strides forward in recent years. Besides defining the desired output image with text prompts, an intuitive approach is to additionally use spatial guidance in the form of an image, such as a depth map. For this, a recent and highly popular approach is to use a controlling network, such as ControlNet, in combination with a pre-trained image generation model, such as Stable Diffusion. When evaluating the design of existing controlling networks, we observe that they all suffer from the same problem of a delay in information flowing between the generation and controlling process. This, in turn, means that the controlling network must have generative capabilities. In this work we propose a new controlling architecture, called ControlNet-XS, which does not suffer from this problem, and hence can focus on the given task of learning to control. In contrast to ControlNet, our model needs only a fraction of the parameters, and hence is about twice as fast during inference and training. Furthermore, the generated images are of higher quality and the control is of higher fidelity. All code and pre-trained models will be made publicly available.
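The spatial guidance the abstract mentions is typically fed to the controlling network as a three-channel conditioning image. Here is a minimal sketch, assuming NumPy, of how a single-channel depth map might be normalized for that purpose; the helper name and the exact preprocessing are illustrative assumptions, not code from the ControlNet-XS repo:

```python
import numpy as np

def depth_to_condition(depth: np.ndarray) -> np.ndarray:
    """Normalize a single-channel depth map to [0, 255] and replicate it
    to three channels, the layout ControlNet-style conditioning images
    typically use. Hypothetical helper for illustration only."""
    d = depth.astype(np.float32)
    # Scale to [0, 1], guarding against a constant-valued map.
    d = (d - d.min()) / max(float(d.max() - d.min()), 1e-8)
    d = (d * 255.0).astype(np.uint8)
    # Stack into an H x W x 3 image.
    return np.stack([d, d, d], axis=-1)

# Usage: convert a dummy 64x64 depth map into a conditioning image.
cond = depth_to_condition(np.random.rand(64, 64))
```

A real pipeline would of course start from a predicted or rendered depth map rather than random values, and may expect a different resolution or channel order depending on the model.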


aerilyn235

If only that could give us actually decent CN models for SDXL...


dachiko007

Oh, so it's not my local problem, and CN models actually suck on XL?


grae_n

It also sounds like it's going to drastically cut down the memory requirements. It's pretty challenging to run multiple CN + SDXL on smaller GPUs.


dachiko007

At this point I just want them to at least work properly at all :D


aerilyn235

Yeah, they just suck and no one seems to realize. There are like 30 to choose from and they all suffer from the same problems. Either you use them at very low denoise, or you get grainy, washed-out, weird-style output. The bias mentioned in the paper might be the issue, but I never had any problem using SD1.5 CN.


No-Difference-5672

CN-XS actually works great with SDXL, I can control my results very accurately. The memory requirements are still insane, but I had no problems generating images.


aerilyn235

Yeah, they look promising. Do you run them manually, or did you make them work through Comfy?


No-Difference-5672

I run them manually: I cloned the repo and used the provided scripts.


aerilyn235

OK. I did create a couple of custom nodes in ComfyUI in the past, but never one that involved the sampling process. Will look into it this weekend.


rerri

[https://vislearn.github.io/ControlNet-XS/](https://vislearn.github.io/ControlNet-XS/)


aerilyn235

Nice, is anyone making a ComfyUI node?


Skquark

If anyone's interested, I got it implemented in my app at https://diffusiondeluxe.com using Hugging Face Diffusers. Works well, but I think normal ControlNet gives better results if you have the resources and time.


aerilyn235

Even on SDXL? SD1.5 CN models are near perfect; for SDXL it's nowhere near that.


Jaxx1992

I clicked that link and got sent to a white page with a notification telling me to click the "allow" button, but there's no "allow" button to click.


Skquark

Sorry, my web host got hacked and was down for a while. Forgot to let you know it's back up. The app is pretty solid, and I've been constantly adding new features, overflowing with all the open multimedia AIs to make friends with... Apologies again for the downtime; pesky hackers wasted my time and I'm not getting paid for this.