GPU acceleration

If you have code or use libraries that are GPU accelerated, you can enable GPU usage for your functions by simply passing the gpu=True argument to your stub.function decorator:

import modal

stub = modal.Stub()

def my_function():
    # code here will be executed on a machine with an available GPU

Using A100 GPUs (alpha)

By default, GPU jobs run on Nvidia Tesla T4s. Modal has experimental support for Nvidia Tesla A100s as well. To use Modal with this GPU type, replace the gpu=True argument with gpu=modal.gpu.A100():

def my_function():


  • Modal A100 workers currently run on a separate cloud provider vs the rest of Modal’s infrastructure. This means that the first time you start up an image with an A100 GPU, there will be an additional latency cost as we transfer image files between cloud providers. However, subsequent runs for that image (including cold starts) will be just as fast as any other Modal function.
  • These functions run on pre-emptible (“spot”) instances, which means that there is a small chance your function can be interrupted. If this happens, in-progress inputs will be rescheduled.
  • Modal SharedVolume data will not be shared between A100 and non-A100 functions, as the underlying file systems are hosted in different places.

We’re actively working on removing all of these constraints, so stay tuned!


Take a look at some of our examples that use GPUs: