Render a video with Blender on many GPUs or CPUs in parallel
This example shows how you can render an animated 3D scene using Blender’s Python interface.
You can run it on CPUs to scale out across one hundred containers, or run it on GPUs to get higher throughput per node. Even for this simple scene, GPUs render >10x faster than CPUs.
The final render looks something like this:
Defining a Modal app
Modal runs your Python functions for you in the cloud. You organize your code into apps, collections of functions that work together.
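For instance, creating the app for this example might look something like the sketch below; the app name is just an illustrative choice, not something the example fixes.

```python
import modal

# the App groups together all the functions defined in this file
app = modal.App("example-blender-video")
```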
We need to define the environment each function runs in — its container image.
The block below defines a container image: starting from a basic Debian Linux image, adding Blender's system-level dependencies, and then installing the bpy package, which is Blender's Python API.
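A sketch of that image definition might look like the following; the specific apt packages and the pinned bpy version are assumptions about what headless Blender needs, not a guaranteed-complete list.

```python
rendering_image = (
    modal.Image.debian_slim(python_version="3.11")
    # system libraries Blender needs to run headless; exact packages are an assumption
    .apt_install("xorg", "libxkbcommon0")
    # Blender as a Python package
    .pip_install("bpy==4.1.0")
)
```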
Rendering a single frame
We define a function that renders a single frame. We’ll scale this function out on Modal later.
Functions in Modal are defined along with their hardware and their dependencies. This function can be run with GPU acceleration or without it, and we’ll use a global flag in the code to switch between the two.
We decorate the function with @app.function to define it as a Modal function.
Note that in addition to defining the hardware requirements of the function,
we also specify the container image that the function runs in (the one we defined above).
The details of the scene aren’t too important for this example, but we’ll load a .blend file that we created earlier. This scene contains a rotating Modal logo made of a transmissive ice-like material, with a generated displacement map. The animation keyframes were defined in Blender.
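Putting those pieces together, a hedged sketch of the render function could look like this. The function name, its signature (receiving the .blend file as bytes plus a frame number), the `WITH_GPU` flag, and the `L40S` GPU type are illustrative assumptions; `configure_rendering` is the helper we define in the next section.

```python
WITH_GPU = True  # flip to False to render on many CPU containers instead


@app.function(
    gpu="L40S" if WITH_GPU else None,  # GPU type is an assumption
    image=rendering_image,
)
def render(blend_file: bytes, frame_number: int = 0) -> bytes:
    """Render one frame of the scene and return it as PNG bytes."""
    import tempfile
    from pathlib import Path

    import bpy

    with tempfile.TemporaryDirectory() as tmp:
        # write the scene into the container so Blender can open it
        blend_path = Path(tmp) / "scene.blend"
        blend_path.write_bytes(blend_file)
        bpy.ops.wm.open_mainfile(filepath=str(blend_path))

        # select the frame to render and where to write the still image
        frame_path = Path(tmp) / f"frame_{frame_number:05}.png"
        bpy.context.scene.frame_set(frame_number)
        bpy.context.scene.render.filepath = str(frame_path)

        # set up the Cycles engine (and the GPU, if requested); defined below
        configure_rendering(bpy.context, with_gpu=WITH_GPU)
        bpy.ops.render.render(write_still=True)

        return frame_path.read_bytes()
```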
Rendering with acceleration
We can configure the rendering process to use GPU acceleration with NVIDIA CUDA. We select the Cycles rendering engine, which is compatible with CUDA, and then activate the GPU.
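A sketch of that configuration, using Blender's standard Cycles preferences API; the resolution and sample count here are assumptions.

```python
def configure_rendering(ctx, with_gpu: bool):
    # use the Cycles engine, which supports CUDA acceleration
    ctx.scene.render.engine = "CYCLES"
    ctx.scene.render.resolution_x = 1920  # assumed 1080p output
    ctx.scene.render.resolution_y = 1080
    ctx.scene.cycles.samples = 128  # assumed sample count

    if with_gpu:
        # point Cycles at CUDA and switch rendering onto the GPU
        cycles_prefs = ctx.preferences.addons["cycles"].preferences
        cycles_prefs.compute_device_type = "CUDA"
        ctx.scene.cycles.device = "GPU"

        # refresh the device list and enable every device Cycles found
        cycles_prefs.get_devices()
        for device in cycles_prefs.devices:
            device.use = True
```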
Combining frames into a video
Rendering 3D images is fun, and GPUs can make it faster, but rendering 3D videos is better! We add another function to our app, running on a different, simpler container image and different hardware, to combine the frames into a video.
The function that combines the frames into a video takes a sequence of byte strings, one for each rendered frame, and returns a single byte string containing the MP4 file.
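A sketch of that function, assuming the frames are PNGs and using ffmpeg (installed into a slim Debian image) for the encoding; the function name, default frame rate, and ffmpeg flags are illustrative choices, not the example's exact code.

```python
combination_image = modal.Image.debian_slim(python_version="3.11").apt_install("ffmpeg")


@app.function(image=combination_image)
def combine(frames_bytes: list[bytes], fps: int = 60) -> bytes:
    """Stitch rendered PNG frames into an MP4 and return it as bytes."""
    import subprocess
    import tempfile
    from pathlib import Path

    with tempfile.TemporaryDirectory() as tmp:
        # write the frames to disk in order so ffmpeg can pick them up by pattern
        for index, frame in enumerate(frames_bytes):
            (Path(tmp) / f"frame_{index:05}.png").write_bytes(frame)

        out_path = Path(tmp) / "output.mp4"
        subprocess.run(
            [
                "ffmpeg", "-y",
                "-framerate", str(fps),
                "-i", str(Path(tmp) / "frame_%05d.png"),
                "-c:v", "libx264",
                "-pix_fmt", "yuv420p",
                str(out_path),
            ],
            check=True,
        )
        return out_path.read_bytes()
```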
Rendering in parallel in the cloud from the comfort of the command line
With these two functions defined, we need only a few more lines to run our rendering at scale on Modal.
First, we need a function that coordinates our functions to render frames and combine them.
We decorate that function with @app.local_entrypoint so that we can run it with modal run blender_video.py.
In that function, we use render.map to map the render function over the range of frames.
We give the local_entrypoint two parameters to control the render — the number of frames to render and how many frames to skip.
These demonstrate a basic pattern for controlling Functions on Modal from a local client.
We collect the bytes from each frame into a list locally and then send it to combine with .remote.
The bytes for the video come back to our local machine, and we write them to a file.
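A sketch of that entrypoint is below; the scene filename, default frame count, and output path are assumptions made for illustration.

```python
@app.local_entrypoint()
def main(frame_count: int = 250, frame_skip: int = 1):
    from pathlib import Path

    # read the scene locally; the filename is an assumption
    blend_bytes = (Path(__file__).parent / "scene.blend").read_bytes()

    # fan out one remote render call per frame number
    frame_numbers = list(range(1, frame_count + 1, frame_skip))
    frames = list(render.map([blend_bytes] * len(frame_numbers), frame_numbers))

    # combine the frames remotely, then write the resulting MP4 locally
    video_bytes = combine.remote(frames)
    Path("output.mp4").write_bytes(video_bytes)
    print(f"wrote {len(video_bytes)} bytes to output.mp4")
```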
The whole rendering process (for four seconds of 1080p 60 FPS video) takes about three minutes to run on 10 L40S GPUs, with a per-frame latency of about six seconds, and about five minutes to run on 100 CPUs, with a per-frame latency of about one minute.