Play with the ControlNet demos
This example allows you to play with all 10 demonstration Gradio apps from the new and amazing ControlNet project. ControlNet provides a minimal interface allowing users to use images to constrain StableDiffusion’s generation process. With ControlNet, users can easily condition StableDiffusion image generation with different spatial contexts, including depth maps, segmentation maps, scribble drawings, and keypoints!
Imports and config preamble
Below are the configuration objects for all 10 demos provided in the original lllyasviel/ControlNet repo. The demos each depend on their own custom pretrained StableDiffusion model, and these models are 5-6GB each. We can only run one demo at a time, so this module avoids downloading the model and ‘detector’ dependencies for all 10 demos and instead uses the demo configuration object to download only what’s necessary for the chosen demo.
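One way such per-demo configuration objects might look is a small dataclass keyed by demo name. This is only a sketch: the field names and the exact checkpoint filenames below are illustrative, not the module's real schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DemoConfig:
    """Per-demo configuration (illustrative fields, not the repo's exact schema)."""

    module_name: str  # demo module in the ControlNet repo, e.g. "gradio_canny2image"
    model_files: list[str] = field(default_factory=list)  # pretrained .pth checkpoints
    detector_files: list[str] = field(default_factory=list)  # detector weights, if any


# Two illustrative entries; the real module defines one config per demo.
DEMOS = {
    "canny2image": DemoConfig(
        module_name="gradio_canny2image",
        model_files=["control_sd15_canny.pth"],
    ),
    "pose2image": DemoConfig(
        module_name="gradio_pose2image",
        model_files=["control_sd15_openpose.pth"],
        detector_files=["body_pose_model.pth", "hand_pose_model.pth"],
    ),
}
```

Keeping the download lists in data like this is what lets the script fetch only the 5-6GB of weights for the one demo you picked, instead of all 10.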
Even just limiting our dependencies setup to what’s required for one demo, the resulting container image is huge.
Pick a demo, any demo
Simply by changing the DEMO_NAME below, you can change which ControlNet demo app is set up
and run by this Modal script.
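The selection mechanism can be as simple as one module-level constant, validated against the known demo names. The names in the set below are an illustrative subset, not the full list of 10:

```python
# Illustrative subset of the demo names; the real script knows all 10.
KNOWN_DEMOS = {"canny2image", "depth2image", "pose2image", "scribble2image"}

# Change this one constant to switch which demo the script sets up and serves.
DEMO_NAME = "canny2image"

# Fail fast with a helpful message if the name is mistyped.
if DEMO_NAME not in KNOWN_DEMOS:
    raise KeyError(f"Unknown demo {DEMO_NAME!r}; expected one of {sorted(KNOWN_DEMOS)}")
```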
Setting up the dependencies
ControlNet requires a lot of dependencies that could be fiddly to set up manually, but Modal’s programmatic container image building Python APIs handle this complexity straightforwardly and automatically.
To run any of the 10 demo apps, we need the following:
- a base Python 3 Linux image (we use Debian Slim)
- a bunch of third-party PyPI packages
- git, so that we can download the ControlNet source code (there’s no controlnet PyPI package)
- some image-processing Linux system packages, including ffmpeg
- demo-specific pre-trained model and detector .pth files
That’s a lot! Fortunately, the code below is already written for you; it stitches together a working container image ready to produce remarkable ControlNet images.
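The assembly might look roughly like this sketch, which chains Modal's Image builder methods. The specific package names, versions, and system libraries here are assumptions for illustration; the real script derives them from the chosen demo's configuration.

```python
import modal

# A sketch of the image build: base image, system packages, Python packages,
# and the ControlNet source itself (cloned with git, since there's no PyPI package).
image = (
    modal.Image.debian_slim(python_version="3.10")
    # System packages: git for cloning, ffmpeg and friends for image processing.
    .apt_install("git", "ffmpeg", "libsm6", "libxext6")
    # Illustrative subset of the third-party PyPI dependencies.
    .pip_install("gradio", "torch", "opencv-python-headless", "einops", "omegaconf")
    # Fetch the ControlNet source code into the image.
    .run_commands("git clone https://github.com/lllyasviel/ControlNet /root/ControlNet")
)
```

The demo-specific .pth model and detector files would be downloaded in a further build step driven by the demo's configuration object.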
Note: a ControlNet model pipeline is now available in Hugging Face’s diffusers package, but it does not include the demo apps.
Serving the Gradio web UI
Each ControlNet Gradio demo module exposes a Blocks Gradio interface running in queue mode,
which is initialized in module scope on import and served on 0.0.0.0. We want the Blocks interface object,
but the queueing and the launched webserver aren’t compatible with Modal’s serverless web endpoint interface,
so in the import_gradio_app_blocks function we patch out these behaviors.
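The patching idea can be sketched in isolation: temporarily replace the methods that enable queueing and start the webserver with no-ops while the demo module's import-time code runs. Here, FakeBlocks is a stand-in for gradio.Blocks and build_demo plays the role of the demo module's import-time code, so the sketch runs without Gradio installed; the real function patches the actual Gradio class.

```python
from unittest import mock


class FakeBlocks:
    """Stand-in for gradio.Blocks, so this sketch runs without Gradio installed."""

    def queue(self, *args, **kwargs):
        raise RuntimeError("queueing is incompatible with Modal's web endpoints")

    def launch(self, *args, **kwargs):
        raise RuntimeError("launch() would start Gradio's own server on 0.0.0.0")


def build_demo():
    """Plays the role of a demo module's import-time code."""
    block = FakeBlocks()
    block.queue()   # normally enables Gradio's request queue
    block.launch()  # normally starts a webserver on 0.0.0.0
    return block


# Patch out the incompatible behaviors, then run the "import".
with mock.patch.object(FakeBlocks, "queue", lambda self, *a, **kw: self), \
     mock.patch.object(FakeBlocks, "launch", lambda self, *a, **kw: None):
    block = build_demo()  # succeeds: queue and launch are now harmless no-ops
```

After the with-block exits, the original methods are restored; we keep only the Blocks object, which is all Modal needs to serve the app itself.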
Because the ControlNet Gradio apps are so time- and compute-intensive to cold-start, the web app function is limited to running just one warm container (max_containers=1). This way, while playing with the demos, we pay the cold-start cost once and have all web requests hit the same warm container; spinning up extra containers to handle additional requests would not be efficient given the cold-start time. We also set scaledown_window to 600 seconds, so the container is kept running for 10 minutes after the last request, keeping the app responsive during continued experimentation.
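Wired together, the serving function might look like this sketch. The app name, GPU type, and the body of web_app are assumptions for illustration; image and block stand for the container image and Blocks object built earlier in the script.

```python
import modal

app = modal.App("controlnet-gradio-demos")  # name is illustrative

# Placeholder: the real script uses the ControlNet image assembled above.
image = modal.Image.debian_slim()


@app.function(
    image=image,
    gpu="A10G",           # assumed GPU type; the real example may differ
    max_containers=1,     # pay the cold-start cost once; all requests share one container
    scaledown_window=600, # keep the container warm for 10 minutes after the last request
)
@modal.asgi_app()
def web_app():
    from fastapi import FastAPI
    from gradio.routes import mount_gradio_app

    # `block` would be the Blocks object returned by import_gradio_app_blocks.
    block = ...
    return mount_gradio_app(app=FastAPI(), blocks=block, path="/")
```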
Have fun!
Serve your chosen demo app with modal serve controlnet_gradio_demos.py. If you don’t have any images ready at hand,
try one that’s in the 06_gpu_and_ml/controlnet/demo_images/ folder.
StableDiffusion was already impressive enough, but ControlNet’s ability to so accurately and intuitively constrain the image generation process is sure to put a big, dumb grin on your face.