Custom containers
By default, Modal functions are executed in a Debian Linux container with a
basic Python installation of the same minor version (3.x) as your local Python
interpreter.
Oftentimes you might need some third-party Python packages, or some other pre-installed dependencies for your function. Modal provides a few different options to customize the container image.
Additional Python packages
The simplest and most common container modification is to add a third-party
Python package, like pandas. To do this, you can create a custom modal.Image
by starting with the Image.debian_slim() function, and then extend the image
by invoking the pip_install method with a list of all the packages you need.
from modal import Image, Stub

stub = Stub()

pandas_image = Image.debian_slim().pip_install("pandas", "numpy")

@stub.function(image=pandas_image)
def my_function():
    import pandas as pd
    import numpy as np

    df = pd.DataFrame()
    ...
Importing Python packages
You might want to use packages inside your Modal code that you don't have on
your local computer. In the example above, we build a container that uses
pandas. But if we don't have pandas locally, on the computer launching the
Modal job, we can't put import pandas at the top of the script, since it would
cause an ImportError.

The easiest solution to this is to put import pandas in the function body
instead, as you can see above. This means that pandas is only imported when
running inside the remote Modal container, which has pandas installed.
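As a plain-Python illustration of this deferred-import pattern (no Modal required; the stdlib statistics module stands in for a package like pandas that only exists in the remote image):

```python
# Deferred import: the module is imported only when the function runs,
# so merely *defining* the function never raises ImportError on a
# machine where the package is missing.
def summarize(rows):
    import statistics  # stdlib stand-in for a remote-only package

    return statistics.mean(rows)

print(summarize([1, 2, 3]))
```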
If you have a lot of functions and a lot of Python packages, you might want to
keep the imports in the global scope so that every function can use the same
imports. In that case, you can use the run_inside()
context manager:
from modal import Image

pandas_image = Image.debian_slim().pip_install("pandas", "numpy")

with pandas_image.run_inside():
    import pandas as pd
    import numpy as np

@stub.function(image=pandas_image)
def my_function():
    df = pd.DataFrame()
Note that run_inside
is considered beta.
Shell commands
You can also supply shell commands that should be executed when building the container image. This can be useful for installing additional binary dependencies:
import subprocess

from modal import Image

ffmpeg_image = Image.debian_slim().apt_install("ffmpeg")

@stub.function(image=ffmpeg_image)
def process_video():
    subprocess.call(["ffmpeg", ...])
Or for preloading custom assets into the container:
from modal import Image

image_with_model = (
    Image.debian_slim().apt_install("curl").run_commands(
        "curl -O https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_frontalcatface.xml",
    )
)

@stub.function(image=image_with_model)
def find_cats():
    content = open("/haarcascade_frontalcatface.xml").read()
    ...
Using existing Docker Hub images
Docker Hub has many pre-built images for common
software packages. You can use any public image in your function using
Image.from_registry, as long as:

- Python 3.7 or above is present, and is available as python
- pip is installed correctly
- The image is built for the linux/amd64 platform
from modal import Image

sklearn_image = Image.from_registry("huanjason/scikit-learn")

@stub.function(image=sklearn_image)
def fit_knn():
    from sklearn.neighbors import KNeighborsClassifier
    ...
If python or pip isn't set up properly, we provide an add_python argument
that installs a reproducible, standalone build of Python:
from modal import Image
image1 = Image.from_registry("ubuntu:22.04", add_python="3.11")
image2 = Image.from_registry("gisops/valhalla:latest", add_python="3.11")
The from_registry function can load images from all public registries, such as
Nvidia's nvcr.io, AWS ECR, and GitHub's ghcr.io.
We also support access to private AWS ECR and GCP Artifact Registry images.
Using Conda instead of pip
Modal provides a pre-built Conda base image if you would like to use conda
for package management. The Python version available is whatever version the
official miniconda3 image currently comes with (3.9.12 at this time).
from modal import Image

pymc_image = Image.conda().conda_install("theano-pymc==1.1.2", "pymc3==3.11.2")

@stub.function(image=pymc_image)
def fit():
    import pymc3 as pm
    ...
Using a Dockerfile
Modal also supports building images from a Dockerfile via the Image.from_dockerfile
function, which takes a path to an existing Dockerfile. For instance:
FROM python:3.9
RUN pip install scikit-learn
from modal import Image

dockerfile_image = Image.from_dockerfile("Dockerfile")

@stub.function(image=dockerfile_image)
def fit():
    import sklearn
    ...
Dockerfile command compatibility
Since Modal doesn’t use Docker to build containers, we have our own implementation of the Dockerfile specification. Most Dockerfiles should work out of the box, but there are some differences to be aware of.
First, a few minor Dockerfile commands and flags have not been implemented yet. Please reach out to us if your use case requires any of these.
Next, there are some command-specific things that may be useful when porting a Dockerfile to Modal.
ENTRYPOINT
While the ENTRYPOINT command is supported, there is an additional constraint
on the entrypoint script provided: it must also exec the arguments passed to
it at some point. This is so that Modal's own Python entrypoint can run after
yours. Most entrypoint scripts in Docker containers are "wrapper" scripts, so
this is typically already the case.
If you wish to write your own entrypoint script, you can use the following as a template:
#!/usr/bin/env bash
# Your custom startup commands here.
exec "$@" # Runs the command passed to the entrypoint script.
If the above file is saved as /usr/bin/my_entrypoint.sh
in your container,
then you can register it as an entrypoint with
ENTRYPOINT ["/usr/bin/my_entrypoint.sh"]
in your Dockerfile, or with
dockerfile_commands
as an
Image build step.
from modal import Image
Image.debian_slim().pip_install("foo").dockerfile_commands('ENTRYPOINT ["/usr/bin/my_entrypoint.sh"]')
ENV
While simple ENV commands are supported, we don't currently support
environment replacement. This means you can't yet do ENV PATH=$PATH:/foo.

A workaround is to use ENTRYPOINT instead, and achieve the same effect with
regular bash commands.
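For example (a sketch; the script path and the /foo directory are placeholders), an entrypoint script can extend PATH before handing control back:

```shell
#!/usr/bin/env bash
# Hypothetical entrypoint script, e.g. saved as /usr/bin/my_entrypoint.sh.
# Emulates the unsupported `ENV PATH=$PATH:/foo` with a plain bash command.
export PATH="$PATH:/foo"

# Hand off to whatever command was passed in, so Modal's own entrypoint
# still runs (see the ENTRYPOINT section above).
exec "$@"
```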
Running a function as a build step (beta)
Instead of using shell commands, you can also run a Python function as an image
build step using the
Image.run_function
method. For
example, you can use this to download model parameters to your image:
from modal import Image, Secret

def download_models():
    import os

    import diffusers

    # model_id is assumed to be defined elsewhere (a Hugging Face model name)
    pipe = diffusers.StableDiffusionPipeline.from_pretrained(
        model_id, use_auth_token=os.environ["HF_TOKEN"]
    )
    pipe.save_pretrained("/model")

image = (
    Image.debian_slim()
    .pip_install("diffusers[torch]", "transformers", "ftfy", "accelerate")
    .run_function(download_models, secrets=[Secret.from_name("huggingface")])
)
Any kwargs accepted by @stub.function (such as Mounts, NetworkFileSystems,
and resource requests) can be supplied to run_function. Essentially, this is
equivalent to running a Modal function and snapshotting the resulting
filesystem as an image.
Please see the reference documentation for an explanation of which changes to your build function trigger image rebuilds.
Forcing an image to rebuild
Modal uses the definition of an image to determine whether it needs to be
rebuilt. In some cases, you may want to force an image to rebuild, even if the
definition hasn’t changed. You can do this by adding the force_build=True
argument to any of the image build steps.
from modal import Image

image = (
    Image.debian_slim()
    .apt_install("git")
    .pip_install("slack-sdk", force_build=True)
    .run_commands("echo hi")
)
In the above example, both pip_install("slack-sdk")
and
run_commands("echo hi")
will run again, but apt_install("git")
will not.
Remember to remove force_build=True
after you’ve rebuilt the image, otherwise
it will rebuild every time you run your code.