If you have code or use libraries that are GPU accelerated, you can attach the
first available GPU to your function by passing the `gpu="any"` argument to the
`@stub.function` decorator:

```python
import modal

stub = modal.Stub()

@stub.function(gpu="any")
def my_function():
    # code here will be executed on a machine with an available GPU
    ...
```
Specifying GPU type
gpu="any" is specified, your function runs in a container with access to
a supported GPU. Currently there are Nvidia
Tesla T4 and
A10G instances. If
you need more control, you can pick a specific GPU type by changing this
def my_t4_function(): ... def my_a10g_function(): ...
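When deciding which type to request, GPU memory is usually the deciding factor: a T4 has 16 GB and an A10G has 24 GB. As a rough illustration, you could encode that choice in a small helper. This helper is not part of Modal's API; the name and thresholds are our own, and the returned strings are the `gpu=` values shown above:

```python
# Hypothetical helper (not part of Modal's API): pick a `gpu=` argument
# string based on how much GPU memory a workload needs.
def pick_gpu(min_memory_gb: int) -> str:
    """Return the smallest suitable GPU type as a Modal `gpu=` string."""
    # Approximate memory sizes: T4 has 16 GB, A10G has 24 GB.
    if min_memory_gb <= 16:
        return "t4"
    if min_memory_gb <= 24:
        return "a10g"
    # Anything larger calls for an A100 (see the alpha section below).
    return "a100"

print(pick_gpu(8))   # -> t4
print(pick_gpu(20))  # -> a10g
```

You would then pass the result straight through, e.g. `@stub.function(gpu=pick_gpu(20))`.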
Using A100 GPUs (alpha)
Modal also has experimental support for A100 GPUs, NVIDIA’s flagship data center chip. They have beefier hardware and more GPU memory. However, while they are in the limited availability phase, you may run into longer queue times to get access to them.
To request this GPU type for running your jobs, replace the `gpu` argument with `"a100"`:

```python
@stub.function(gpu="a100")
def my_a100_function():
    ...
```
In addition to longer queue times, there are limitations:
- Modal A100 workers currently run on a separate cloud provider from the rest of Modal’s infrastructure. This means that the first time you start up an image with an A100 GPU, there will be an additional latency cost as we transfer files between cloud providers. However, subsequent runs for that image (including cold starts) will be just as fast as any other Modal function.
- These functions run on pre-emptible (“spot”) instances, which means that there is a small chance your function can be interrupted. If this happens, in-progress inputs will be rescheduled.
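Because an interrupted input is simply rescheduled and run again, it helps to make your function idempotent: re-running it with the same input should produce the same result without repeating side effects. A minimal, Modal-independent sketch of that pattern (all names here are illustrative, and the in-memory dict stands in for durable storage):

```python
# Illustrative sketch: make work safe to re-run after a spot interruption
# by recording finished inputs, so a rescheduled input is not redone.
completed: dict[str, str] = {}  # stands in for a durable store (e.g. a database)

def process(input_id: str, payload: str) -> str:
    # If a previous (interrupted, then rescheduled) run already finished
    # this input, return the stored result instead of recomputing.
    if input_id in completed:
        return completed[input_id]
    result = payload.upper()  # placeholder for the real GPU work
    completed[input_id] = result
    return result
```

With this shape, a rescheduled input either finds its result already recorded or redoes the work from scratch; either way the final output is the same.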
We’re actively working on removing all of these constraints, so stay tuned!
Take a look at some of our examples that use GPUs: