Modal has raised an $87M Series B led by Lux Capital.
Modal Training

Train more, configure less

Launch more experiments and training jobs. Spin up single-node experiments or scale to multi-node GPU training instantly.

“Modal lets us deploy new ML models in hours rather than weeks. We use it across spam detection, recommendations, audio transcription, and video pipelines, and it’s helped us move faster with far less complexity.”

Mike Cohen, Head of AI & ML Engineering

“Modal's user-friendly interface and efficient tools have truly empowered our team to navigate data-intensive tasks with ease, enabling us to achieve our project goals more efficiently.”

Karim Atiyeh, Co-Founder & CTO

Where researchers can run experiments, not ops

Define in code

Define your training function with Modal’s SDK. Easily keep ML dependencies and GPU requirements in sync with application code.

image = (
    modal.Image.from_registry(
        f"nvidia/cuda:{tag}"
    )
    .uv_pip_install(
        "accelerate",
        "torch",
    )
)

@app.function(gpu="B200:8", image=image)
@modal.clustered(size=4, rdma=True)
def train_multi_node():
    ...

Native storage

Ingest training data from anywhere: Modal’s distributed Volumes, cloud buckets, or your local filesystem.

volume = modal.Volume.from_name(
    "training_data_vol"
)

@app.function(
    volumes={
        "/my-s3-mount": modal.CloudBucketMount(
            "training_data_s3",
            secret=secret,
        ),
        "/my-volume": volume,
    }
)
def train():
    ...
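
Inside the function, both mounts appear as ordinary directories, so training code can read them with standard file APIs. A minimal sketch, assuming the data is stored as files with a hypothetical `.parquet` suffix; the helper name `list_shards` is illustrative and not part of Modal's SDK:

```python
from pathlib import Path


def list_shards(roots, suffix=".parquet"):
    """Return all files ending in `suffix` under the given mount points, sorted.

    `roots` would be the mount paths from the example above, e.g.
    ["/my-volume", "/my-s3-mount"]. Directories that do not exist are skipped.
    """
    shards = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # mount not present, e.g. when running outside the container
        shards.extend(p for p in root_path.rglob(f"*{suffix}") if p.is_file())
    return sorted(shards)
```

Called as `list_shards(["/my-volume", "/my-s3-mount"])` at the top of `train()`, this would give one combined, deterministic list of input files regardless of which storage backend holds them.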

Sub-second startup

Modal’s container stack launches GPU-backed containers for your function in under a second. Fan out experiments to accelerate your research.
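
Fanning out usually means enumerating a grid of configurations and running one training call per configuration. A minimal local sketch, assuming a hypothetical `train(config)` function and made-up hyperparameter names; a thread pool stands in here for remote containers, and on Modal the per-configuration calls would go through the SDK instead:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_grid(space):
    """Expand {"lr": [...], "batch_size": [...]} into one dict per combination."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


def train(config):
    """Placeholder for a real training run (hypothetical); echoes its config."""
    return {"config": config, "status": "done"}


def fan_out(space, max_workers=4):
    """Run `train` once per configuration, concurrently, preserving grid order."""
    configs = build_grid(space)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train, configs))


if __name__ == "__main__":
    results = fan_out({"lr": [1e-4, 3e-4], "batch_size": [32, 64]})
    print(len(results))  # one result per (lr, batch_size) pair
```

Keeping grid construction separate from execution means the same list of configurations can be handed to whatever runs the jobs, local or remote.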

Speed up training jobs by going multi-node

Scale from 1 GPU to 64 with just one line of code

Spin up a cluster in a second with no minimum commitments

B200, H200, and H100 clusters equipped with InfiniBand and private networking

No black boxes. You control the training logic.
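
For a sense of scale: with a GPU spec such as `"B200:8"` and a cluster size of 4, as in the multi-node example earlier on this page, a job spans 4 × 8 = 32 GPUs. One way to reach 64 would be 8 nodes of 8 GPUs each; that split is an assumption, not something the page states. The helper below only does this arithmetic and is not part of Modal's SDK:

```python
def gpus_per_node(gpu_spec):
    """Parse a spec like "B200:8" into its per-node GPU count; "H100" means 1."""
    _, _, count = gpu_spec.partition(":")
    return int(count) if count else 1


def total_gpus(gpu_spec, nodes=1):
    """Total GPUs across a cluster of `nodes` containers with the same spec."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return gpus_per_node(gpu_spec) * nodes


print(total_gpus("B200:8", nodes=4))  # 32
print(total_gpus("B200:8", nodes=8))  # 64
```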





Ship your first app in minutes.

$30 / month free compute