Nvidia launched the A100 in 2020. It quickly became a popular choice for generative AI training and inference workloads thanks to its larger VRAM and more powerful Tensor Cores relative to its predecessors. Though it is no longer the most powerful Nvidia GPU on the market (the H100 has since surpassed it), the A100 can often still be the best price-to-performance choice for training and deploying large models that are a step below the bleeding edge.
## A100 Configurations
A100 pricing varies by configuration. The two dimensions you'll see most often are GPU memory (40GB vs. 80GB) and form factor (PCIe vs. SXM). 80GB A100s are more expensive than 40GB models; depending on the size of the model you are fine-tuning or serving, you may need the larger version.
Between the two form factors, SXM A100s are more performant and more expensive, because the SXM socket mounts the GPU directly on the board and supports NVLink, enabling much faster GPU-to-GPU communication than PCIe.
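A quick way to estimate which memory size you need: model weights take roughly 2 bytes per parameter in fp16/bf16, plus headroom for activations and KV cache. Below is a back-of-envelope sketch in Python; the 20% overhead factor is an assumption for illustration, and real headroom depends heavily on batch size and sequence length.

```python
def fits_on_a100(num_params_b: float, bytes_per_param: int = 2, overhead: float = 1.2) -> str:
    """Back-of-envelope check: which A100 memory size can hold the model?

    num_params_b: model size in billions of parameters
    bytes_per_param: 2 for fp16/bf16, 1 for int8, 4 for fp32
    overhead: fudge factor for activations/KV cache (assumption, workload-dependent)
    """
    weights_gb = num_params_b * bytes_per_param  # 1B params * 2 bytes ≈ 2 GB
    needed_gb = weights_gb * overhead
    if needed_gb <= 40:
        return f"~{needed_gb:.0f} GB needed: fits on a 40GB A100"
    elif needed_gb <= 80:
        return f"~{needed_gb:.0f} GB needed: needs the 80GB A100"
    else:
        return f"~{needed_gb:.0f} GB needed: needs multiple A100s"

print(fits_on_a100(7))   # ~17 GB needed: fits on a 40GB A100
print(fits_on_a100(13))  # ~31 GB needed: fits on a 40GB A100
print(fits_on_a100(70))  # ~168 GB needed: needs multiple A100s
```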
## Direct purchase price from Nvidia
Current market prices vary by configuration: the 40GB PCIe model goes for $8,000 to $10,000 per GPU, while the 80GB SXM model goes for $18,000 to $20,000.
## Alternatives to direct purchase: A100s in the cloud
### Traditional cloud platforms: AWS, GCP, OCI
While there are rare scenarios where a company might want to set up its own data centers, the high upfront cost means that most companies seek out cloud platforms instead. The traditional approach has been to reserve GPUs for terms of 1-3 years, and the market for reservations has been dominated by Amazon, Google, and Oracle.
AWS (Amazon) and GCP (Google) also offer more flexible purchase models for A100s, including on-demand and spot. On-demand lets users pay for A100s by the second, but this flexibility comes at a much higher per-hour price than a reservation. Spot lets users take advantage of unused capacity and is also billed by the second; because there is no guarantee that workloads won't be preempted, it is always cheaper than the on-demand equivalent.
Below is a comparison table of current A100 per-hour list prices across these platforms.
GPU type | Purchase model | AWS* | GCP* | OCI |
---|---|---|---|---|
A100 40GB SXM | 1-year reservation | $2.52 | $2.31 | $3.05 |
A100 40GB SXM | 3-year reservation | $1.56 | $1.29 | n/a |
A100 40GB SXM | On-demand | $4.10 | $3.67 | n/a |
A100 40GB SXM | Spot | $1.15 | $1.17 | n/a |
A100 80GB SXM | 1-year reservation | $3.15 | n/a | $4.00 |
A100 80GB SXM | 3-year reservation | $1.95 | n/a | n/a |
A100 80GB SXM | On-demand | $5.12 | $5.12 | n/a |
A100 80GB SXM | Spot | n/a | $1.57 | n/a |
*Prices based on us-east-1 for AWS and us-central1 for GCP. Note that prices will vary based on region.
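One way to read these numbers: a reservation bills around the clock whether or not the GPU is busy, so it only beats on-demand above a certain utilization. A minimal sketch using the AWS A100 40GB SXM prices from the table above:

```python
# Break-even utilization: 1-year reservation vs. on-demand,
# using the AWS A100 40GB SXM prices from the table above.
reserved_per_hr = 2.52   # billed 24/7 for the full reservation term
on_demand_per_hr = 4.10  # billed only while the instance is running

break_even = reserved_per_hr / on_demand_per_hr
print(f"The reservation is cheaper above {break_even:.0%} utilization")  # ~61%
```

In other words, if your A100s would sit busy less than about 61% of the time, on-demand is the cheaper of the two for this instance type.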
AWS also offers a purchase model called a Savings Plan, which discounts on-demand pricing in exchange for a 1- or 3-year commitment to a baseline level of hourly spend. It is more flexible than a reservation but comes at a higher price.
It can be tricky to locate information on A100s for these cloud providers because each provider uses a different naming convention. Below is a guide to the instance type names associated with A100s on each provider:
GPU type | AWS | GCP | OCI |
---|---|---|---|
A100 40GB SXM | p4d.24xlarge | a2-highgpu-*, a2-megagpu-* | BM.GPU4.8 |
A100 80GB SXM | p4de.24xlarge | a2-ultragpu-* | BM.GPU.A100-v2.8 |
Note that AWS and OCI only offer A100s in configurations of 8 GPUs, while GCP offers configurations of 1 to 16.
### Serverless compute startups
Many companies are also exploring alternatives to the hyperscalers, for a few reasons:
- Inflexibility of GPU configurations
- Lack of availability without making a big commitment
- Slow and manual provisioning of resources
- Overhead of configuring and managing infra
Newer, up-and-coming GPU platforms are trying to solve these pain points by offering a more flexible model for accessing and scaling resources. By taking a serverless approach, these platforms spin up GPUs for users only when needed and abstract away the complexities of managing the underlying infra. Below is a comparison of A100 per-hour prices for the most popular serverless GPU providers.
GPU type | Modal | Lambda Labs | RunPod | Baseten |
---|---|---|---|---|
A100 40GB SXM | $2.78 | $1.29 | n/a | n/a |
A100 80GB SXM | $3.40 | $1.79 | $2.72 | $6.144 |
You may notice that these prices are slightly higher than the spot or reservation prices of the hyperscalers, but these list prices don’t tell the full story. Because serverless options are much quicker to autoscale GPUs for you and only charge based on usage, you can achieve significantly higher utilization (and therefore lower overall cost) on workloads with unpredictable volume.
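To make that concrete, here's a sketch of effective cost per busy GPU-hour, comparing an always-billed 3-year reservation against a serverless platform billed only on use (prices taken from the tables above):

```python
# Effective cost per *busy* GPU-hour: reserved (always billed) vs. serverless (billed on use).
reserved_per_hr = 1.95    # AWS 3-year reservation, A100 80GB SXM (from the table above)
serverless_per_hr = 3.40  # Modal A100 80GB list price

for utilization in (0.10, 0.25, 0.50, 0.75):
    effective_reserved = reserved_per_hr / utilization  # idle hours still cost money
    print(f"{utilization:>4.0%} busy: reserved ${effective_reserved:.2f}/busy-hr "
          f"vs. serverless ${serverless_per_hr:.2f}/busy-hr")
# Below ~57% utilization (1.95 / 3.40), the serverless option is cheaper per busy hour.
```

Under these list prices, the serverless option wins whenever your GPUs are busy less than roughly 57% of the time, and bursty inference workloads often sit far below that.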
Interested in running something on A100s? On Modal, you can deploy a function with an A100 attached in a matter of minutes. Try it out!
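As a taste, here's a minimal sketch using Modal's Python SDK (the exact API surface may differ slightly depending on your SDK version; see Modal's docs for the canonical example):

```python
import modal

app = modal.App("a100-example")

# Request an A100 for this function; gpu="A100-80GB" selects the 80GB variant.
@app.function(gpu="A100")
def check_gpu():
    import subprocess
    # Print details of the GPU attached to this container.
    subprocess.run(["nvidia-smi"], check=True)

@app.local_entrypoint()
def main():
    check_gpu.remote()
```

Run it with `modal run`, and Modal provisions the A100, executes the function, and spins the GPU back down when it returns.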