AI infrastructure for enterprise

Serverless AI inference, large-scale batch processing, sandboxed code execution, and much more.

32,000

companies rely on Modal

12s

to scale up to 100 H100s

20,000

sandboxes you can run concurrently

We've previously managed to break services like GitHub because of our load, so when Modal was able to handle the massive scale of our AI weekend event so smoothly, that meant a lot.

Anton Osika, Founder & CEO

Instant scalability

Scale from zero to thousands of containers in seconds. Our serverless architecture automatically provisions resources when you need them and scales down to zero when you don't.

Features

Flexible Environments

Bring your own image or build one in Python, scale resources as needed, and leverage state-of-the-art GPUs like H100s & A100s for high-performance computing.
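As a rough sketch of what such an environment can look like with Modal's Python SDK (the app name, package list, and GPU choice here are illustrative, not prescribed):

```python
import modal

# Build a container image in Python; the package list is illustrative.
image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "torch", "transformers"
)

app = modal.App("inference-example")  # hypothetical app name

# Attach the image and request an H100 GPU for this function.
@app.function(image=image, gpu="H100")
def generate(prompt: str) -> str:
    # Model-serving logic would go here.
    return prompt.upper()
```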


Seamless Integrations

Export function logs to Datadog or any OpenTelemetry-compatible provider, and easily mount cloud storage from major providers (S3, R2, etc.).


Data Storage

Manage data effortlessly with built-in storage primitives: network volumes, key-value stores, and queues, all provisioned and accessed with familiar Python syntax.
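A minimal sketch of those three primitives in Modal's SDK; the names ("model-cache", "run-metadata", "job-queue") and the mount path are hypothetical:

```python
import modal

app = modal.App("storage-example")  # hypothetical app name

# Provision a network volume, a key-value store, and a queue by name.
volume = modal.Volume.from_name("model-cache", create_if_missing=True)
kv = modal.Dict.from_name("run-metadata", create_if_missing=True)
jobs = modal.Queue.from_name("job-queue", create_if_missing=True)

@app.function(volumes={"/cache": volume})
def process(job_id: str):
    kv[job_id] = "started"   # key-value store, dict-style access
    jobs.put(job_id)         # queue, familiar put/get semantics
    with open("/cache/last_job", "w") as f:
        f.write(job_id)      # files written under the mount path
    volume.commit()          # persist volume changes
```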


Job Scheduling

Take control of your workloads with powerful scheduling. Set up cron jobs, retries, and timeouts, or use batching to optimize resource usage.
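For example, a cron schedule, retries, and a timeout are all parameters on the function decorator (the schedule and limits below are illustrative):

```python
import modal

app = modal.App("scheduling-example")  # hypothetical app name

# Run every day at 09:00 UTC, retry failures up to 3 times,
# and give each attempt a 10-minute timeout.
@app.function(schedule=modal.Cron("0 9 * * *"), retries=3, timeout=600)
def daily_report():
    ...  # workload logic goes here
```

`modal.Period` (e.g. `modal.Period(hours=1)`) can be used instead of `modal.Cron` for simple fixed intervals.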


Web Endpoints

Deploy and manage web services with ease. Create custom domains, set up streaming and websockets, and serve functions as secure HTTPS endpoints.
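A sketch of serving a function over HTTPS; note the endpoint decorator's name has varied across SDK versions (`web_endpoint` in older releases, `fastapi_endpoint` in newer ones), so check the version you have installed:

```python
import modal

app = modal.App("endpoint-example")  # hypothetical app name

# Serve this function as a secure HTTPS endpoint.
@app.function()
@modal.fastapi_endpoint(method="GET")
def hello(name: str = "world") -> dict:
    return {"greeting": f"Hello, {name}!"}
```

Deploying the app yields a public HTTPS URL for the endpoint, which can also be bound to a custom domain.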


Built-In Debugging

Troubleshoot efficiently with built-in debugging tools. Use the modal shell for interactive debugging and set breakpoints to pinpoint issues quickly.

Use Cases

Generative AI Inference that scales with you

Deploy and scale AI models effortlessly with our optimized infrastructure designed for high-performance inference.


Fast cold boots

Load gigabytes of weights in seconds with our optimized container file system.


Bring your own code

Deploy anything from custom models to popular frameworks.


Seamless autoscaling

Handle bursty and unpredictable load by scaling to thousands of GPUs and back down to zero.

Security and governance

Built on top of gVisor

The secure application kernel for containers, providing top-tier isolation in multi-tenant setups.

SOC 2 and HIPAA

SOC 2 compliant and ready for HIPAA-compliant workloads, with industry-standard controls for security, availability, and confidentiality.

Region support

Deploy globally and pin workloads to specific geographic regions to meet data-residency and compliance requirements.

SSO sign in for enterprise

Enterprise-grade SSO for transparent, streamlined access management.

Run AI infrastructure at scale.