GPU acceleration

If your code, or a library you use, is GPU-accelerated, you can attach the first available GPU to your function by passing the gpu="any" argument to the @stub.function decorator:

import modal

stub = modal.Stub()

@stub.function(gpu="any")
def my_function():
    # code here will be executed on a machine with an available GPU
    ...

Specifying GPU type

When gpu="any" is specified, your function runs in a container with access to a supported GPU. Currently these are NVIDIA Tesla T4 and A10G instances. If you need more control, you can pick a specific GPU type by changing this argument:

@stub.function(gpu="T4")
def my_t4_function():
    ...

@stub.function(gpu="A10G")
def my_a10g_function():
    ...

Using A100 GPUs (alpha)

Modal also has experimental support for A100 GPUs, NVIDIA’s flagship data center chip. They have beefier hardware and more GPU memory. However, while they are in the limited-availability phase, you may run into longer queue times before getting access to one.

To request this GPU type for running your jobs, replace the gpu="any" argument with gpu="A100":

@stub.function(gpu="A100")
def my_a100_function():
    ...
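The accepted values for the gpu argument shown on this page can be checked up front. The helper below is hypothetical (it is not part of Modal’s API); it simply validates a string against the GPU types documented here:

```python
# Hypothetical helper -- NOT part of Modal's API. It checks a gpu=
# argument against the values documented on this page.
SUPPORTED_GPU_TYPES = {"any", "T4", "A10G", "A100"}


def validate_gpu_argument(gpu: str) -> str:
    """Return gpu unchanged if it is a documented value, else raise."""
    if gpu not in SUPPORTED_GPU_TYPES:
        raise ValueError(
            f"Unknown gpu type {gpu!r}; expected one of {sorted(SUPPORTED_GPU_TYPES)}"
        )
    return gpu


validate_gpu_argument("A100")  # passes silently
```

A check like this can catch a typo (say, gpu="a100" instead of gpu="A100") at import time rather than at deployment.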

In addition to longer queue times, there are limitations:

  • Modal A100 workers currently run on a separate cloud provider from the rest of Modal’s infrastructure. This means that the first time you start up an image with an A100 GPU, there will be an additional latency cost as we transfer files between cloud providers. However, subsequent runs for that image (including cold starts) will be just as fast as any other Modal function.
  • These functions run on pre-emptible (“spot”) instances, which means there is a small chance your function may be interrupted. If this happens, in-progress inputs will be rescheduled.
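Because a rescheduled input is run again from the start, it helps to write functions whose side effects are safe to repeat. A minimal idempotency sketch in plain Python (not Modal-specific; the in-memory completed set stands in for whatever durable store you might use, which is an assumption for illustration):

```python
# Idempotent-processing sketch: re-running an input that was already
# handled before an interruption produces no duplicate side effects.
completed: set[str] = set()  # stand-in for a durable store (e.g. a database)
results: dict[str, int] = {}


def process(input_id: str, value: int) -> None:
    if input_id in completed:
        # Already handled before the interruption; skip the work.
        return
    results[input_id] = value * 2  # the actual work
    completed.add(input_id)


process("a", 1)
process("a", 1)  # rescheduled duplicate: a no-op the second time
```

With this pattern, an interruption-and-reschedule at worst repeats a cheap membership check instead of duplicating work.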

We’re actively working on removing all of these constraints, so stay tuned!

Examples

Take a look at some of our examples that use GPUs: