February 6, 2024
Introducing: H100s on Modal

NVIDIA’s H100 GPUs are the fastest machine learning accelerators on the market. They have also been almost impossible to get a hold of.

But starting now, all Modal users can access them!

For the largest language models, H100s boast up to 4x training speedups and up to 30x inference speedups over A100s, according to NVIDIA’s benchmarks.

Chart: NVIDIA-reported H100 vs. A100 speedups. Source: https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet

Faster cards cost more to run, so we recommend reaching for H100s on Modal where the spend is justified. That could be a latency-sensitive application like interactive LLM inference, where every millisecond counts, or a throughput-bound job like fine-tuning a foundation model, where the better price-to-performance ratio can, in some cases, mean a lower total cost.

Our H100s have 80 GB of high-bandwidth memory (HBM) feeding compute units capable of nearly two thousand teraFLOPS at 16-bit precision.

And at $7.65 per GPU per hour, you can run jobs that use up to 8 GPUs communicating over NVIDIA NVLink with 3.6 TB/s of bisection bandwidth.
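
If a single job needs more than one card, you can attach several H100s to the same container by adding a count to the GPU argument. Here is a minimal sketch, assuming the colon-count syntax and a hypothetical train_big_model function:

import modal

app = modal.App()

@app.function(gpu="H100:8")  # request eight H100s for one container
def train_big_model():
    # All eight GPUs are visible inside this function,
    # e.g. torch.cuda.device_count() would report 8.
    ...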

On Modal, you only pay for what you use. Thanks to our robust autoscaling, you can achieve significantly higher utilization and thus lower overall costs compared to fixed GPU reservations.

Whether you’re responding to a burst of inference requests when your app hits the top of Hacker News or launching a thousand ML experiments in parallel right before a deadline, Modal is here to serve your compute needs, and you stop paying the moment the job is done.
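
For the parallel-experiments case, a Modal function can be fanned out over many inputs with .map, and containers scale up and back down automatically. A minimal sketch, assuming a hypothetical train_one function that takes a learning rate:

import modal

app = modal.App()

@app.function(gpu="H100")
def train_one(lr: float) -> float:
    # Placeholder: train one configuration on its own H100 and return a metric.
    return lr

@app.local_entrypoint()
def sweep():
    # One call per learning rate, run in parallel across containers.
    for result in train_one.map([1e-4, 3e-4, 1e-3]):
        print(result)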

Getting started is simple: just pass gpu="H100" to the Modal decorator of the function you want to run remotely:

import modal

app = modal.App()

@app.function(gpu="H100")
def run_gpt5():
    # This will run on Modal's H100s
    import subprocess

    subprocess.run(["nvidia-smi"], check=True)  # print the attached GPU's details
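
Save the file as, say, h100_app.py (any filename works), then kick off a remote run with the Modal CLI:

modal run h100_app.py::run_gpt5

Modal provisions a container with an H100 attached, runs the function, and streams its output back to your terminal.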

If you have questions on our H100 support or want to share something incredible you built that uses H100s, please reach out in our community Slack.
