Animate images with Lightricks LTX-Video via CLI, API, and web UI
This example shows how to run LTX-Video on Modal to generate videos from your local command line, via an API, and in a web UI.
Generating a 5-second video takes ~1 minute from a cold start. Once the container is warm, a 5-second video takes ~15 seconds.
Here is a sample we generated:
Basic setup
All Modal programs need an App —
an object that acts as a recipe for the application.
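A minimal sketch of that setup; the app name here is an assumption, not necessarily the one the example uses:

```python
import modal

# The name is illustrative -- pick whatever identifies your deployment.
app = modal.App("example-ltx-video")
```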
Configuring dependencies
The model runs remotely, on Modal’s cloud, which means we need to define the environment it runs in.
Below, we start from a lightweight base Linux image
and then install our system and Python dependencies,
like Hugging Face’s diffusers library and torch.
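A sketch of what that image definition might look like. The package list and the Python version are assumptions, not the example's exact pins:

```python
import modal

# Packages below are illustrative guesses at what LTX-Video inference needs.
image = modal.Image.debian_slim(python_version="3.12").pip_install(
    "diffusers",          # Hugging Face's diffusion-pipeline library
    "transformers",       # text encoders used by the pipeline
    "accelerate",         # device-placement helpers used by diffusers
    "torch",              # the underlying tensor library
    "imageio[ffmpeg]",    # video encoding, for saving .mp4 outputs
    "fastapi[standard]",  # needed later for the web endpoint
)
```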
Storing model weights on Modal
We also need the parameters of the model remotely. They can be loaded at runtime from Hugging Face, based on a repository ID and a revision (aka a commit SHA).
Hugging Face will also cache the weights to disk once they’re downloaded. But Modal Functions are serverless, and so even disks are ephemeral, which means the weights would get re-downloaded every time we spin up a new instance.
We can fix this — without any modifications to Hugging Face’s model loading code! — by pointing the Hugging Face cache at a Modal Volume. For more on storing model weights on Modal, see this guide.
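One way to sketch this, with illustrative Volume and path names (the guide's actual names may differ):

```python
import modal

CACHE_PATH = "/cache"  # mount point inside the container; name is illustrative
model_cache = modal.Volume.from_name("hf-hub-cache", create_if_missing=True)

# Point Hugging Face's cache at the Volume's mount point via an environment
# variable -- no changes to the model-loading code are required.
cache_env = {"HF_HUB_CACHE": CACHE_PATH}
# Then attach both to the Function: volumes={CACHE_PATH: model_cache}, env=cache_env.
```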
Storing model outputs on Modal
Contemporary video models can take a long time to run and they produce large outputs. That makes them a great candidate for storage on Modal Volumes as well. Python code running outside of Modal can also access this storage, as we’ll see below.
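A sketch of an outputs Volume, with an assumed name and mount point:

```python
import modal

OUTPUTS_PATH = "/outputs"  # where the Volume is mounted in the container
outputs = modal.Volume.from_name("ltx-outputs", create_if_missing=True)
```

Code running outside Modal can later pull files back down, for instance with the `modal volume get` CLI command or the Volume's read methods.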
Implementing LTX-Video inference on Modal
We wrap the inference logic in a Modal Cls, which ensures that models are loaded and moved onto the GPU once, when a new instance starts, rather than on every call.
The run function just wraps a diffusers pipeline.
It saves the generated video to a Modal Volume, and returns the filename.
We also include a web wrapper that makes it possible
to trigger inference via an API call.
For details, see the /docs route of the URL ending in inference-web.modal.run that appears when you deploy the app.
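Put together, the skeleton of that class might look like the following sketch. The GPU type, model ID, generation parameters, and endpoint fields are assumptions drawn from the `diffusers` LTX-Video integration, not a verbatim copy of the example:

```python
import io

import modal

app = modal.App("example-ltx-video")  # illustrative name

image = modal.Image.debian_slim(python_version="3.12").pip_install(
    "diffusers", "transformers", "accelerate", "torch",
    "imageio[ffmpeg]", "fastapi[standard]",
)
outputs = modal.Volume.from_name("ltx-outputs", create_if_missing=True)


@app.cls(gpu="H100", image=image, volumes={"/outputs": outputs}, timeout=10 * 60)
class Inference:
    @modal.enter()
    def load_pipeline(self):
        # Runs once per container start: load weights, move them to the GPU.
        import torch
        from diffusers import LTXImageToVideoPipeline

        self.pipe = LTXImageToVideoPipeline.from_pretrained(
            "Lightricks/LTX-Video", torch_dtype=torch.bfloat16
        ).to("cuda")

    @modal.method()
    def run(self, image_bytes: bytes, prompt: str) -> str:
        # Wraps the diffusers pipeline, saves the generated video to the
        # Volume, and returns the filename.
        from uuid import uuid4

        from diffusers.utils import export_to_video
        from PIL import Image

        frames = self.pipe(
            image=Image.open(io.BytesIO(image_bytes)),
            prompt=prompt,
            num_frames=121,  # ~5 seconds at 24 fps (assumed settings)
        ).frames[0]

        mp4_name = f"{uuid4()}.mp4"
        export_to_video(frames, f"/outputs/{mp4_name}", fps=24)
        outputs.commit()  # persist the file so readers outside see it
        return mp4_name

    @modal.fastapi_endpoint(method="POST", docs=True)
    def web(self, image_url: str, prompt: str):
        # Thin HTTP wrapper around `run`; the request fields are illustrative.
        import urllib.request

        image_bytes = urllib.request.urlopen(image_url).read()
        return {"filename": self.run.local(image_bytes, prompt)}
```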
Generating videos from the command line
We add a local entrypoint that calls the Inference.run method to run inference from the command line.
The function’s parameters are automatically turned into a CLI.
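A sketch of such an entrypoint, assuming the `app` and `Inference` objects defined earlier in the script; the argument names and defaults are illustrative:

```python
from pathlib import Path


@app.local_entrypoint()
def main(
    image_path: str = "dog.png",  # local input image; name is illustrative
    prompt: str = "A dog wagging its tail",
):
    # Each parameter above becomes a --flag on the generated CLI.
    mp4_name = Inference().run.remote(Path(image_path).read_bytes(), prompt)
    print(f"video saved to the outputs Volume as {mp4_name}")
```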
Run it with
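Assuming the example lives in a file called `ltx.py` (the filename and flag are assumptions):

```shell
modal run ltx.py --prompt "a dog wagging its tail"
```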
You can also pass --help to see the full list of arguments.
Generating videos via an API
The Modal Cls above also included a fastapi_endpoint,
which adds a simple web API to the inference method.
To try it out, run
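Assuming the example is saved as `ltx.py`:

```shell
modal deploy ltx.py
```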
copy the printed URL ending in inference-web.modal.run,
and add /docs to the end. This will bring up the interactive
Swagger/OpenAPI docs for the endpoint.
Generating videos in a web UI
Lastly, we add a simple front-end web UI (written in Alpine.js) for our image to video backend.
This is also deployed when you run
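Again assuming the example is saved as `ltx.py`:

```shell
modal deploy ltx.py
```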
The Inference class automatically serves multiple users from its own auto-scaling pool of warm GPU containers,
which spin down when there are no requests.