Stable Diffusion (A1111)

This example runs the popular AUTOMATIC1111/stable-diffusion-webui project on Modal, without modification. We just port the environment setup to a Modal container image and wrap the launch script with a @web_server decorator, and we’re ready to go.

You can run a temporary A1111 server with modal serve a1111_webui.py or deploy it permanently with modal deploy a1111_webui.py.

import subprocess

from modal import App, Image, web_server

PORT = 8000

First, we define the image A1111 will run in. This takes a few steps because A1111 usually install its dependencies on launch via a script. The process may take a few minutes the first time, but subsequent image builds should only take a few seconds.

a1111_image = (
    Image.debian_slim(python_version="3.11")
    .apt_install(
        "wget",
        "git",
        "libgl1",
        "libglib2.0-0",
        "google-perftools",  # For tcmalloc
    )
    .env({"LD_PRELOAD": "/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4"})
    .run_commands(
        "git clone --depth 1 --branch v1.7.0 https://github.com/AUTOMATIC1111/stable-diffusion-webui /webui",
        "python -m venv /webui/venv",
        "cd /webui && . venv/bin/activate && "
        + "python -c 'from modules import launch_utils; launch_utils.prepare_environment()' --xformers",
        gpu="a10g",
    )
    .run_commands(
        "cd /webui && . venv/bin/activate && "
        + "python -c 'from modules import shared_init, initialize; shared_init.initialize(); initialize.initialize()'",
        gpu="a10g",
    )
)

app = App(
    "example-a1111-webui", image=a1111_image
)  # Note: prior to April 2024, "app" was called "stub"

After defining the custom container image, we start the server with accelerate launch. This function is also where you would configure hardware resources, CPU/memory, and timeouts.

If you want to run it with an A100 or H100 GPU, just change gpu="a10g" to gpu="a100" or gpu="h100".

Startup of the web server should finish in under one to three minutes.

@app.function(
    gpu="a10g",
    cpu=2,
    memory=1024,
    timeout=3600,
    # Allows 100 concurrent requests per container.
    allow_concurrent_inputs=100,
    # Keep at least one instance of the server running.
    keep_warm=1,
)
@web_server(port=PORT, startup_timeout=180)
def run():
    START_COMMAND = f"""
cd /webui && \
. venv/bin/activate && \
accelerate launch \
    --num_processes=1 \
    --num_machines=1 \
    --mixed_precision=fp16 \
    --dynamo_backend=inductor \
    --num_cpu_threads_per_process=6 \
    /webui/launch.py \
        --skip-prepare-environment \
        --no-gradio-queue \
        --listen \
        --port {PORT}
"""
    subprocess.Popen(START_COMMAND, shell=True)