ControlNet - More Precise Control Over AI-Generated Images

Soon after Lvmin Zhang and Maneesh Agrawala of Stanford University submitted their paper “Adding Conditional Control to Text-to-Image Diffusion Models”, the trained model based on Stable Diffusion has been made available for download on GitHub!

The paper, submitted to arXiv (the preprint server hosted by Cornell University) on 10 February 2023, presents a “neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions”. According to the paper, the ControlNet architecture lets large image diffusion models (like Stable Diffusion) learn task-specific input conditions and generate finely controlled results from the prompts and images that users supply.

Put simply, ControlNet steers a large pre-trained model through additional input content. It learns these conditions through end-to-end training, and the current Stable Diffusion-based release works this way.

By introducing a method that lets image diffusion models like Stable Diffusion use additional input conditions, ControlNet offers an efficient way to tell an AI model which parts of an input image to keep, bringing unprecedented levels of control compared with previous text-to-image models.
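To make the idea of steering a pre-trained model with an extra input more concrete, here is a toy sketch. It assumes the design described in the paper: the original weights stay frozen, a trainable copy receives the condition, and the two are joined through zero-initialized projections. Everything named here (`frozen_block`, `ControlBranch`, `DIM`, the tiny dense layers) is hypothetical and stands in for real network blocks; it is an illustration of the principle, not the authors' implementation.

```python
import math
import random

random.seed(0)
DIM = 4  # hypothetical feature size for this toy example


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def zeros(n):
    return [[0.0] * n for _ in range(n)]


def random_matrix(n, scale=1.0):
    return [[random.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]


# Frozen "pretrained" block: these weights are never updated.
W_FROZEN = random_matrix(DIM)


def frozen_block(x):
    return [math.tanh(h) for h in matvec(W_FROZEN, x)]


class ControlBranch:
    """Trainable copy of the block, joined through zero-initialized projections."""

    def __init__(self):
        self.w_copy = [row[:] for row in W_FROZEN]  # starts as a copy of the frozen weights
        self.z_in = zeros(DIM)    # projects the condition into the copy (zero at start)
        self.z_out = zeros(DIM)   # projects the copy's output back (zero at start)

    def __call__(self, x, cond):
        injected = [xi + ci for xi, ci in zip(x, matvec(self.z_in, cond))]
        hidden = [math.tanh(h) for h in matvec(self.w_copy, injected)]
        return matvec(self.z_out, hidden)


def controlled_block(x, cond, branch):
    """Frozen output plus the control branch's contribution."""
    return [f + b for f, b in zip(frozen_block(x), branch(x, cond))]


x = [random.gauss(0.0, 1.0) for _ in range(DIM)]
cond = [random.gauss(0.0, 1.0) for _ in range(DIM)]
branch = ControlBranch()

# With zero projections the branch adds nothing: output equals the frozen model's.
out_before = controlled_block(x, cond, branch)

# Stand-in for training: pretend gradient steps moved the projections off zero.
branch.z_in = random_matrix(DIM, 0.1)
branch.z_out = random_matrix(DIM, 0.1)
out_after = controlled_block(x, cond, branch)  # now influenced by cond
```

The point of the zero initialization in this sketch is that the combined model starts out behaving exactly like the frozen one, so the condition only gains influence as training moves the projections away from zero.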

In the code shared on GitHub, the creators demonstrate a number of conditions that ControlNet can use for fine control on top of Stable Diffusion, including:

Canny Edge

ControlNet - Canny Edge
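For this condition, the extra input is an edge map extracted from a source image. In practice one would typically produce it with a library routine such as OpenCV's `cv2.Canny`. The sketch below is a deliberately simplified stand-in, not the Canny algorithm itself: it uses Sobel gradient magnitude with a single threshold and skips Canny's smoothing, non-maximum suppression, and hysteresis steps. The function name `sobel_edge_map`, the threshold value, and the toy image are all assumptions made for illustration.

```python
def sobel_edge_map(image, threshold=100.0):
    """Return a binary edge map (0 or 255) for a grayscale image given as a list of rows.

    Simplified stand-in for Canny: Sobel gradient magnitude plus one threshold.
    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# Toy input: a 6x8 image that is dark on the left half and bright on the right.
image = [[0] * 4 + [255] * 4 for _ in range(6)]
edge_map = sobel_edge_map(image)

for row in edge_map:
    print(row)  # interior rows mark the two columns on either side of the brightness step
```

An edge map like this, white lines on a black background, is the kind of image that would be supplied alongside the text prompt as the additional condition.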