modal.gpu

GPU configuration shortcodes

You can pass a wide range of str values for the gpu parameter of @app.function.

For instance:

  • gpu="H100" will attach 1 H100 GPU to each container
  • gpu="L40S" will attach 1 L40S GPU to each container
  • gpu="T4:4" will attach 4 T4 GPUs to each container

You can see a list of Modal GPU options in the GPU docs.

Example
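As a minimal sketch of how the shortcode strings listed above are read, the snippet below splits a "TYPE" or "TYPE:COUNT" value into a GPU type and a count. The names GpuRequest and parse_gpu_shortcode are hypothetical and are not part of Modal, which performs its own validation of the gpu value. The sketch only illustrates the convention that a bare type means one GPU per container.

```python
from typing import NamedTuple


class GpuRequest(NamedTuple):
    """Hypothetical container for a parsed shortcode (not a Modal type)."""

    gpu_type: str
    count: int


def parse_gpu_shortcode(shortcode: str) -> GpuRequest:
    """Split a "TYPE" or "TYPE:COUNT" string into its parts.

    Illustrative only: Modal validates the gpu value itself.
    """
    gpu_type, sep, count_text = shortcode.partition(":")
    if not gpu_type:
        raise ValueError(f"missing GPU type in {shortcode!r}")
    # A bare type such as "H100" means one GPU per container.
    count = int(count_text) if sep else 1
    if count < 1:
        raise ValueError(f"GPU count must be positive, got {count}")
    return GpuRequest(gpu_type, count)


# These are the strings you would pass as @app.function(gpu=...).
for code in ("H100", "L40S", "T4:4"):
    request = parse_gpu_shortcode(code)
    print(f"gpu={code!r} -> {request.count} x {request.gpu_type}")
```

In real use the string goes straight into the decorator, for example @app.function(gpu="T4:4"), with no parsing needed on your side.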

Deprecation notes

An older, deprecated way to configure GPUs is still supported but will be removed in a future version of Modal. Examples:

  • gpu=modal.gpu.H100() will attach 1 H100 GPU to each container
  • gpu=modal.gpu.T4(count=4) will attach 4 T4 GPUs to each container
  • gpu=modal.gpu.A100() will attach 1 A100-40GB GPU to each container
  • gpu=modal.gpu.A100(size="80GB") will attach 1 A100-80GB GPU to each container
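When migrating away from the deprecated objects, it can help to see how their arguments correspond to the string form. The helper below is a hypothetical sketch, not part of the Modal library. It assumes that a count is written as a ":COUNT" suffix (as in "T4:4") and that a memory size is written as a "-SIZE" suffix (as in "A100-80GB"). Check the GPU docs for the exact strings Modal accepts.

```python
from typing import Optional


def to_shortcode(gpu_type: str, count: int = 1, size: Optional[str] = None) -> str:
    """Build a string such as "T4:4" or "A100-80GB" from old-style arguments.

    Hypothetical helper for illustration; it is not a Modal API.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    # Assumption: a memory size is appended as a "-SIZE" suffix.
    name = f"{gpu_type}-{size}" if size else gpu_type
    # One GPU is the default, so the ":COUNT" suffix is omitted for count == 1.
    return name if count == 1 else f"{name}:{count}"


# Each line mirrors one of the deprecated calls listed above.
print(to_shortcode("H100"))               # modal.gpu.H100()
print(to_shortcode("T4", count=4))        # modal.gpu.T4(count=4)
print(to_shortcode("A100", size="80GB"))  # modal.gpu.A100(size="80GB")
```

The resulting strings are the values you would pass directly as the gpu parameter in place of the deprecated objects.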

modal.gpu.A100 

NVIDIA A100 Tensor Core GPU class.

The flagship data center GPU of the Ampere architecture. Available in 40GB and 80GB GPU memory configurations.

modal.gpu.A10G 

NVIDIA A10G Tensor Core GPU class.

A mid-tier data center GPU based on the Ampere architecture, providing 24GB of GPU memory. Compared with NVIDIA T4 GPUs, A10G GPUs deliver up to 3.3x better ML training performance, 3x better ML inference performance, and 3x better graphics performance.

modal.gpu.Any 

Selects any one of the GPU classes available within Modal, according to availability.

modal.gpu.H100 

NVIDIA H100 Tensor Core GPU class.

The flagship data center GPU of the Hopper architecture. It offers enhanced support for FP8 precision and a Transformer Engine that provides up to 4x faster training than the prior generation for GPT-3 (175B) models.

modal.gpu.L4 

NVIDIA L4 Tensor Core GPU class.

A mid-tier data center GPU based on the Ada Lovelace architecture, providing 24GB of GPU memory. Includes RTX (ray tracing) support.

modal.gpu.L40S 

NVIDIA L40S GPU class.

The L40S is a data center GPU based on the Ada Lovelace architecture. It has 48GB of GDDR6 GPU memory and enhanced support for FP8 precision.

modal.gpu.T4 

NVIDIA T4 Tensor Core GPU class.

A low-cost data center GPU based on the Turing architecture, providing 16GB of GPU memory.