modal.gpu

GPU configuration shortcodes

The following are the valid str values for the gpu parameter of @app.function.

  • "t4" → GPU(T4, count=1)
  • "l4" → GPU(L4, count=1)
  • "a100" → GPU(A100-40GB, count=1)
  • "a100-80gb" → GPU(A100-80GB, count=1)
  • "h100" → GPU(H100, count=1)
  • "a10g" → GPU(A10G, count=1)
  • "any" → GPU(Any, count=1)

The shortcodes also support specifying count by suffixing :N to acquire N GPUs. For example, a10g:4 will provision 4 A10G GPUs.
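To make the shortcode format concrete, here is a minimal sketch of how a string such as a10g:4 splits into a GPU type and a count. It is an illustration only and not Modal's own parser: parse_gpu_shortcode, GPUSpec, and SHORTCODES are hypothetical names, the table is copied from the list above, and the strict exact-match lookup and error handling are assumptions.

```python
from typing import NamedTuple

# Shortcode -> GPU type label, copied from the list above.
SHORTCODES = {
    "t4": "T4",
    "l4": "L4",
    "a100": "A100-40GB",
    "a100-80gb": "A100-80GB",
    "h100": "H100",
    "a10g": "A10G",
    "any": "Any",
}


class GPUSpec(NamedTuple):
    """Hypothetical container for a parsed shortcode."""
    gpu_type: str
    count: int


def parse_gpu_shortcode(value: str) -> GPUSpec:
    """Split 'name' or 'name:N' into a GPU type label and a count.

    Hypothetical helper for illustration; not part of the modal package.
    """
    name, sep, count_text = value.partition(":")
    if name not in SHORTCODES:
        raise ValueError(f"unknown GPU shortcode: {name!r}")
    count = 1  # a bare shortcode means a single GPU
    if sep:
        try:
            count = int(count_text)
        except ValueError:
            raise ValueError(f"invalid GPU count: {count_text!r}") from None
        if count < 1:
            raise ValueError(f"GPU count must be at least 1, got {count}")
    return GPUSpec(SHORTCODES[name], count)


print(parse_gpu_shortcode("a10g:4"))
print(parse_gpu_shortcode("a100-80gb"))
```

In real use the string itself is what gets passed as the gpu parameter of @app.function; the sketch only shows how the two parts of the string relate to the GPU(type, count) forms listed above.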

Other configurations can be created using the constructors documented below.

modal.gpu.A100

class A100(modal.gpu._GPUConfig)

NVIDIA A100 Tensor Core GPU class.

The flagship data center GPU of the Ampere architecture. Available in 40GiB and 80GiB GPU memory configurations.

def __init__(
    self,
    *,
    count: int = 1,  # Number of GPUs per container. Defaults to 1.
    size: Union[str, None] = None,  # Select GiB configuration of GPU device: "40GB" or "80GB". Defaults to "40GB".
):

modal.gpu.A10G

class A10G(modal.gpu._GPUConfig)

NVIDIA A10G Tensor Core GPU class.

A mid-tier data center GPU based on the Ampere architecture, providing 24GiB of GPU memory. Compared with NVIDIA T4 GPUs, A10G GPUs deliver up to 3.3x better ML training performance, 3x better ML inference performance, and 3x better graphics performance.

def __init__(
    self,
    *,
    # Number of GPUs per container. Defaults to 1.
    # Useful if you have very large models that don't fit on a single GPU.
    count: int = 1,
):
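As a rough guide to choosing count for a model that does not fit on a single GPU, the sketch below divides the model size by the per-GPU memory and rounds up. It is a back-of-the-envelope estimate under stated assumptions: min_gpu_count is a hypothetical helper, the 60GiB model size is an invented example, and the calculation counts weights only, ignoring activations, optimizer state, and framework overhead, so real requirements are usually higher.

```python
import math


def min_gpu_count(model_gib: float, gib_per_gpu: float) -> int:
    """Smallest number of GPUs whose combined memory holds model_gib.

    Hypothetical helper: counts weights only, with no headroom for
    activations or runtime overhead.
    """
    if model_gib <= 0 or gib_per_gpu <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(model_gib / gib_per_gpu)


A10G_GIB = 24  # per-GPU memory from the A10G description above

# Example: a model with 60 GiB of weights (an assumed figure).
count = min_gpu_count(60, A10G_GIB)

# Build the ":N" shortcode form described at the top of this page.
gpu_shortcode = f"a10g:{count}"
print(gpu_shortcode)
```

The resulting string can be used as the gpu value, or the same number can be passed as count to the constructor above.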

modal.gpu.Any

class Any(modal.gpu._GPUConfig)

Selects any one of the GPU classes available within Modal, according to availability.

def __init__(self, *, count: int = 1):

modal.gpu.H100

class H100(modal.gpu._GPUConfig)

NVIDIA H100 Tensor Core GPU class.

The flagship data center GPU of the Hopper architecture. It offers enhanced support for FP8 precision and a Transformer Engine that provides up to 4x faster training than the prior generation for GPT-3 (175B) models.

def __init__(
    self,
    *,
    # Number of GPUs per container. Defaults to 1.
    # Useful if you have very large models that don't fit on a single GPU.
    count: int = 1,
):

modal.gpu.L4

class L4(modal.gpu._GPUConfig)

NVIDIA L4 Tensor Core GPU class.

A mid-tier data center GPU based on the Ada Lovelace architecture, providing 24GiB of GPU memory. Includes RTX (ray tracing) support.

def __init__(
    self,
    count: int = 1,  # Number of GPUs per container. Defaults to 1.
):

modal.gpu.T4

class T4(modal.gpu._GPUConfig)

NVIDIA T4 Tensor Core GPU class.

A low-cost data center GPU based on the Turing architecture, providing 16GiB of GPU memory.

def __init__(
    self,
    count: int = 1,  # Number of GPUs per container. Defaults to 1.
):