Render a video with Blender on many GPUs or CPUs in parallel
This example shows how you can render an animated 3D scene using Blender’s Python interface.
You can run it on CPUs to scale out on one hundred containers or run it on GPUs to get higher throughput per node. Even for this simple scene, GPUs render >10x faster than CPUs.
The final render looks something like this:
Defining a Modal app
from pathlib import Path
import modal
Modal runs your Python functions for you in the cloud. You organize your code into apps, collections of functions that work together.
app = modal.App("examples-blender-video")
We need to define the environment each function runs in — its container image.
The block below defines a container image, starting from a basic Debian Linux image
adding Blender’s system-level dependencies
and then installing the bpy
package, which is Blender’s Python API.
rendering_image = (
modal.Image.debian_slim(python_version="3.11")
.apt_install("xorg", "libxkbcommon0") # X11 (Unix GUI) dependencies
.pip_install("bpy==4.1.0") # Blender as a Python package
)
Rendering a single frame
We define a function that renders a single frame. We’ll scale this function out on Modal later.
Functions in Modal are defined along with their hardware and their dependencies. This function can be run with GPU acceleration or without it, and we’ll use a global flag in the code to switch between the two.
WITH_GPU = True # try changing this to False to run rendering massively in parallel on CPUs!
We decorate the function with @app.function
to define it as a Modal function.
Note that in addition to defining the hardware requirements of the function,
we also specify the container image that the function runs in (the one we defined above).
The details of the scene aren’t too important for this example, but we’ll load a .blend file that we created earlier. This scene contains a rotating Modal logo made of a transmissive ice-like material, with a generated displacement map. The animation keyframes were defined in Blender.
@app.function(
gpu="L40S" if WITH_GPU else None,
# default limits on Modal free tier
concurrency_limit=10 if WITH_GPU else 100,
image=rendering_image,
)
def render(blend_file: bytes, frame_number: int = 0) -> bytes:
"""Renders the n-th frame of a Blender file as a PNG."""
import bpy
input_path = "/tmp/input.blend"
output_path = f"/tmp/output-{frame_number}.png"
# Blender requires input as a file.
Path(input_path).write_bytes(blend_file)
bpy.ops.wm.open_mainfile(filepath=input_path)
bpy.context.scene.frame_set(frame_number)
bpy.context.scene.render.filepath = output_path
configure_rendering(bpy.context, with_gpu=WITH_GPU)
bpy.ops.render.render(write_still=True)
# Blender renders image outputs to a file as well.
return Path(output_path).read_bytes()
Rendering with acceleration
We can configure the rendering process to use GPU acceleration with NVIDIA CUDA. We select the Cycles rendering engine, which is compatible with CUDA, and then activate the GPU.
def configure_rendering(ctx, with_gpu: bool):
# configure the rendering process
ctx.scene.render.engine = "CYCLES"
ctx.scene.render.resolution_x = 3000
ctx.scene.render.resolution_y = 2000
ctx.scene.render.resolution_percentage = 50
ctx.scene.cycles.samples = 128
cycles = ctx.preferences.addons["cycles"]
# Use GPU acceleration if available.
if with_gpu:
cycles.preferences.compute_device_type = "CUDA"
ctx.scene.cycles.device = "GPU"
# reload the devices to update the configuration
cycles.preferences.get_devices()
for device in cycles.preferences.devices:
device.use = True
else:
ctx.scene.cycles.device = "CPU"
# report rendering devices -- a nice snippet for debugging and ensuring the accelerators are being used
for dev in cycles.preferences.devices:
print(
f"ID:{dev['id']} Name:{dev['name']} Type:{dev['type']} Use:{dev['use']}"
)
Combining frames into a video
Rendering 3D images is fun, and GPUs can make it faster, but rendering 3D videos is better! We add another function to our app, running on a different, simpler container image and different hardware, to combine the frames into a video.
combination_image = modal.Image.debian_slim(python_version="3.11").apt_install(
"ffmpeg"
)
The function to combine the frames into a video takes a sequence of byte sequences, one for each rendered frame, and converts them into a single sequence of bytes, the MP4 file.
@app.function(image=combination_image)
def combine(frames_bytes: list[bytes], fps: int = 60) -> bytes:
import subprocess
import tempfile
with tempfile.TemporaryDirectory() as tmpdir:
for i, frame_bytes in enumerate(frames_bytes):
frame_path = Path(tmpdir) / f"frame_{i:05}.png"
frame_path.write_bytes(frame_bytes)
out_path = Path(tmpdir) / "output.mp4"
subprocess.run(
f"ffmpeg -framerate {fps} -pattern_type glob -i '{tmpdir}/*.png' -c:v libx264 -pix_fmt yuv420p {out_path}",
shell=True,
)
return out_path.read_bytes()
Rendering in parallel in the cloud from the comfort of the command line
With these two functions defined, we need only a few more lines to run our rendering at scale on Modal.
First, we need a function that coordinates our functions to render
frames and combine
them.
We decorate that function with @app.local_entrypoint
so that we can run it with modal run blender_video.py
.
In that function, we use render.map
to map the render
function over the range of frames.
We give the local_entrypoint
two parameters to control the render — the number of frames to render and how many frames to skip.
These demonstrate a basic pattern for controlling Functions on Modal from a local client.
We collect the bytes from each frame into a list
locally and then send it to combine
with .remote
.
The bytes for the video come back to our local machine, and we write them to a file.
The whole rendering process (for four seconds of 1080p 60 FPS video) takes about three minutes to run on 10 L40S GPUs, with a per-frame latency of about six seconds, and about five minutes to run on 100 CPUs, with a per-frame latency of about one minute.
@app.local_entrypoint()
def main(frame_count: int = 250, frame_skip: int = 1):
output_directory = Path("/tmp") / "render"
output_directory.mkdir(parents=True, exist_ok=True)
input_path = Path(__file__).parent / "IceModal.blend"
blend_bytes = input_path.read_bytes()
args = [
(blend_bytes, frame) for frame in range(1, frame_count + 1, frame_skip)
]
images = list(render.starmap(args))
for i, image in enumerate(images):
frame_path = output_directory / f"frame_{i + 1}.png"
frame_path.write_bytes(image)
print(f"Frame saved to {frame_path}")
video_path = output_directory / "output.mp4"
video_bytes = combine.remote(images)
video_path.write_bytes(video_bytes)
print(f"Video saved to {video_path}")