Modal provides multi-GPU training in repeatable environments, as easy as a function call away.

Get started
Multi-GPU training

Fine-tune models on up to 8 GPUs with the sharding technique of your choice . Access up to 640 GB of VRAM with A100 80 GB nodes.


Define environments in code so your fine-tuning runs work are repeatable for your entire team — no more finicky Jupyter notebooks!

On-demand resources

Spawn fine-tuning runs on-demand from your app or your terminal, and pay only for GPUs when you use them. Easily define and run hyperparameter sweeps.

Model storage

Store fine-tuned weights or LoRA adapters in modal.Volume, as easily as writing to local disk. Volumes are optimized for high read throughput, so future cold-start times are blazing fast.

Try it out

Ship your first app in minutes

with $30 / month free compute