ARTIFICIAL INTELLIGENCE | AI ART | STABLE DIFFUSION
ControlNet: control your AI art generation
A new model allows fine control and gets the most out of Stable Diffusion

Stable Diffusion is a revolutionary model that allows you to generate images from text. Can it be improved? Yes, but how? Let’s find out in this article.
What is ControlNet?
Stable Diffusion allows you to get high-quality images simply by writing a text prompt. In addition, the model also accepts an image as input to start from (not just text). Thanks to it and similar models, today it is easy to generate incredible images in seconds.
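To make this concrete, here is a minimal sketch of both modes using Hugging Face’s diffusers library. It assumes the runwayml/stable-diffusion-v1-5 checkpoint, a CUDA GPU, and a placeholder sketch.png file for the image-to-image case:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

model_id = "runwayml/stable-diffusion-v1-5"

# Text-to-image: the prompt alone conditions the generation.
txt2img = StableDiffusionPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")
image = txt2img("a watercolor painting of a lighthouse at sunset").images[0]
image.save("lighthouse.png")

# Image-to-image: a starting image plus a prompt guide the result.
img2img = StableDiffusionImg2ImgPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")
init = Image.open("sketch.png").convert("RGB").resize((512, 512))
image = img2img(prompt="a detailed fantasy castle", image=init, strength=0.7).images[0]
image.save("castle.png")
```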
On the other hand, the model is not perfect: it is not always easy to write the right prompt, and we do not always get what we want, so alternative approaches have been explored.
Broadly speaking, Stable Diffusion works by using text to conditionally generate an image from noise. ControlNet, a new model published by researchers at Stanford, adds another form of conditioning (which I will explain more about in a moment) and thus allows us to control image generation much better.
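In practice, the extra conditioning is just another image, for example a Canny edge map, a depth map, or a pose skeleton. Here is a hedged sketch with diffusers, assuming the lllyasviel/sd-controlnet-canny checkpoint and a precomputed edge map saved as canny_edges.png:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Load a ControlNet trained on Canny edge maps and attach it to Stable Diffusion.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

edges = Image.open("canny_edges.png")  # the extra conditioning image
image = pipe("a futuristic sports car, studio lighting", image=edges).images[0]
image.save("controlled_car.png")
```

The edge map constrains the layout of the output, while the text prompt still controls style and content.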
Stable Diffusion starts from noise and generates an image under text prompt conditioning (the information extracted from a language model that tells the U-Net how to modify the image). At each step, the model removes some of the noise and adds detail. Over the course of these steps in latent space, what was once noise looks more and more like an image. Finally, the decoder transforms the denoised latents into an image in pixel space.
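To see where each piece fits, here is a simplified version of that loop written with the individual diffusers components. It is only a sketch, assuming the runwayml/stable-diffusion-v1-5 checkpoint and a CUDA GPU; classifier-free guidance and other details are omitted for brevity:

```python
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDIMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"
device = "cuda"

# The pieces of the pipeline: text encoder, U-Net denoiser, VAE decoder, noise scheduler.
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder").to(device)
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet").to(device)
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae").to(device)
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Text prompt conditioning: the embeddings that tell the U-Net what to draw.
tokens = tokenizer("a castle on a hill, oil painting", padding="max_length",
                   max_length=tokenizer.model_max_length, return_tensors="pt").to(device)
with torch.no_grad():
    text_emb = text_encoder(tokens.input_ids)[0]

# Start from pure noise in latent space.
scheduler.set_timesteps(50)
latents = torch.randn(1, unet.config.in_channels, 64, 64, device=device)
latents = latents * scheduler.init_noise_sigma

# At each step the U-Net predicts the noise and the scheduler removes a little of it.
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = unet(latents, t, encoder_hidden_states=text_emb).sample
    latents = scheduler.step(noise_pred, t, latents).prev_sample

# Finally the VAE decoder maps the denoised latents back to pixel space.
with torch.no_grad():
    image = vae.decode(latents / vae.config.scaling_factor).sample
```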