Code Sandbox for AI: Isolated Workloads with Sub-Second Starts

Deploy untrusted code, inference endpoints, and batch jobs in secure environments that scale from zero to thousands of GPUs instantly.

Scale, Quora, Substack, Meta, Lovable, Suno, Mistral, Cartesia

Direct Answer

What is a code sandbox?

A code sandbox is an isolated execution environment that runs untrusted or experimental code safely, preventing resource conflicts and security breaches. Modern cloud sandboxes provide containerized workspaces with dedicated compute, networking isolation, and automatic cleanup after execution. Modal is a code sandbox platform for AI developers and data scientists that combines secure code execution with GPU snapshotting for sub-second cold starts and elastic scaling across multi-cloud infrastructure.

  • Sub-second cold starts: Modal's runtime spins up sandboxes in less than one second.
  • Fully isolated execution: Each function invocation runs in a dedicated sandbox with a private filesystem, network, and namespace.
  • Elastic autoscaling: Modal autoscales to 50,000+ concurrent sessions without any pre-warming or capacity reservations.

What you can build

Secure execution for every AI workload

Modal Sandboxes give you isolated, ephemeral environments that scale from zero to thousands of GPUs in seconds with no infrastructure to manage.

Read the docs

LLM inference endpoints

Scale to zero between requests and handle traffic spikes without pre-provisioning capacity.

Batch processing jobs

Run embedding generation, video transcoding, or dataset preprocessing across hundreds of CPUs in parallel.

Untrusted code execution

Run user-submitted scripts safely with CPU and memory limits, network restrictions, and timeout enforcement.

Model training and fine-tuning

Spin up A100 or H100 GPUs on demand and terminate automatically when training completes.

What a code sandbox is

  • Creates a boundary around running code with isolated CPU, memory, GPU, and network resources
  • Each sandbox runs in a fresh container: no shared state, no dependency conflicts, no cross-tenant interference
  • Modal replaces YAML configs and Dockerfiles with a Python decorator: @app.function(gpu="A100")
  • Scales from zero to 1,000+ GPUs in minutes without pre-warming or capacity reservations
  • Teams run LLM inference, fine-tuning jobs, and batch workloads with identical Python syntax

Why secure execution matters now

  • LLM inference traffic spikes unpredictably: static infrastructure wastes money, and running without isolation risks data leaks
  • Sandboxed execution prevents one user's workload from accessing another's model weights or training data
  • Modal's code-first interface creates a sandbox with one line of code, dynamically defined at runtime
  • Modal's intelligent scheduler routes workloads across clouds to bypass quota limits
  • Teams using Modal scaled to 50,000+ concurrent sessions without pre-warming
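
As a rough illustration of creating a sandbox at runtime, the sketch below uses the Modal Python SDK calls `modal.App.lookup`, `modal.Sandbox.create`, `exec`, and `terminate`. The app name `sandbox-demo` and the helpers `python_command` and `run_in_sandbox` are hypothetical names chosen for this example. It assumes `pip install modal` and a configured Modal account, and the SDK surface may differ between versions.

```python
from typing import List


def python_command(code: str) -> List[str]:
    """Build the argv that runs a Python snippet inside the sandbox."""
    return ["python", "-c", code]


def run_in_sandbox(code: str, app_name: str = "sandbox-demo") -> str:
    """Run `code` in a fresh Modal Sandbox and return its stdout.

    The Modal SDK is imported lazily so this module can be loaded
    without the SDK or credentials being present.
    """
    import modal

    # Look up (or create) the app that owns the sandbox.
    app = modal.App.lookup(app_name, create_if_missing=True)
    # The sandbox itself is created with a single call at runtime.
    sandbox = modal.Sandbox.create(app=app, image=modal.Image.debian_slim())
    try:
        process = sandbox.exec(*python_command(code))
        output = process.stdout.read()
        process.wait()
        return output
    finally:
        # Tear the sandbox down even if the command fails.
        sandbox.terminate()
```

With credentials in place, calling `run_in_sandbox("print(2 + 2)")` would execute the snippet remotely and return whatever it printed.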

Getting started in 3 steps

Step 1: Install the Modal SDK and write a function (5 min)

Run pip install modal, decorate a Python function with @app.function(gpu="A100"), and define dependencies inline. Modal builds a container image automatically and caches layers for instant reuse. No Dockerfile. No registry push.

Step 2: Deploy your sandbox to the cloud (1 command)

Run modal deploy to push your function live with automatic HTTPS endpoints, autoscaling, and logging enabled. Modal selects a cloud region with GPUs in stock, pulls the cached sandbox container, and starts execution in under one second.

Step 3: Invoke and monitor your workload (real-time)

Call your function via Python client, REST API, or cron schedule, then watch logs and GPU metrics in the Modal dashboard. If 500 requests arrive simultaneously, Modal spawns 500 sandboxes in parallel. When traffic drops to zero, Modal scales down and charges nothing.

Lovable scales to 250,000 app creations in 48 hours

1 million sandboxes in 48 hours. Zero pages.

"Modal was the only infrastructure provider that enabled us to reliably run tens of thousands of app creation sessions in an instant. We're excited to build with them for the long term."

Anton Osika, Founder and CEO at Lovable

Lovable, which hit $76M ARR in 7 months, uses Modal Sandboxes to run LLM-generated code for every creation session. During a viral promotion event with Anthropic, OpenAI, and Google, Modal handled a 2.6x surge in concurrent sessions: over 1 million sandboxes ran during the event, with up to 25,000 running concurrently at peak. Lovable's platform team was not paged once across the entire weekend.


Who benefits most

Built for every AI team

AI/ML developers

You need GPU sandboxes for inference and live training without managing Kubernetes. Modal cuts deployment time from days to minutes with a Python decorator and built-in autoscaling.

Data scientists

Run experiments on a remote cluster without configuration headaches. Modal parallelizes jobs across hundreds of GPUs within seconds during a session.

AI engineering teams

Isolate and test new code safely away from production. Code sandboxes let your team experiment with confidence without putting your infrastructure at risk.

ML researchers

Access H100 GPU slots instantly. Modal's multi-cloud capacity pool eliminates quota waits and reserved instance delays.

"We use Modal to run edge inference with <10ms overhead and batch jobs at large scale. Our team loves the platform for the power and flexibility it gives us."

Brian Ichter, Co-founder

"Modal makes it easy to write code that runs on 100s of GPUs in parallel, transcribing podcasts in a fraction of the time."

Mike Cohen, Head of Data

"Everyone here loves Modal because it helps us move so much faster. We rely on it to handle massive spikes in volume for evals, RL environments, and MCP servers."

Aakash Sabharwal, VP of Engineering

Join Modal's developer community

Modal Community Slack
Erin Boyle (@erinselene), ML Engineer, Tesla

This tool is awesome. So empowering to have your infra needs met with just a couple decorators. Good people, too!

Jai Chopra (@jai_chopra), Product, LanceDB

Recently built an app on Lambda and just started to use @modal, the difference is insane! Modal is amazing, virtually no cold start time, onboarding experience is great

Izzy Miller (@isidoremiller), DevRel, Hex

special shout out to @modal for providing the crucial infrastructure to run this! Modal is the coolest tool I've tried in a really long time. Cannot say enough good things.

Frequently asked questions

What programming languages does a Modal code sandbox support?

Modal natively supports Python, and you can run virtually any language inside a sandbox by installing it as a dependency in your container image. Teams regularly run Node.js, Ruby, PHP, Rust, and shell scripts inside Modal Sandboxes.
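
To make the "install it as a dependency" idea concrete, here is a minimal sketch that adds Node.js to a Debian-based image with `apt_install` and runs a JavaScript one-liner in a sandbox. The app name `polyglot-sandbox-demo` and the helpers `node_command` and `run_node_snippet` are hypothetical. The example assumes the Modal SDK is installed and authenticated, and that the distribution's `nodejs` package is recent enough for your script.

```python
from typing import List


def node_command(script: str) -> List[str]:
    """Build the argv that evaluates a JavaScript snippet with Node.js."""
    return ["node", "-e", script]


def run_node_snippet(script: str) -> str:
    """Run a JavaScript snippet in a sandbox whose image includes Node.js."""
    import modal  # imported lazily; requires the Modal SDK and credentials

    # Start from a slim Debian image and add Node.js via the system package manager.
    image = modal.Image.debian_slim().apt_install("nodejs")
    app = modal.App.lookup("polyglot-sandbox-demo", create_if_missing=True)
    sandbox = modal.Sandbox.create(app=app, image=image)
    try:
        process = sandbox.exec(*node_command(script))
        output = process.stdout.read()
        process.wait()
        return output
    finally:
        sandbox.terminate()
```

The same pattern should carry over to other runtimes by swapping the installed package and the command.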

How does Modal ensure isolated execution between sandboxes?

Each Modal sandbox runs in its own container with dedicated CPU, memory, and GPU resources. Network namespaces prevent cross-tenant communication, and Modal's runtime enforces strict process boundaries while starting containers 100x faster than Docker.

Can I use Modal as a secure code execution environment for untrusted user input?

Yes. Modal Sandboxes are designed for exactly this use case. You can define resource limits (CPU, memory, GPU), network restrictions, timeouts, and filesystem access controls. Many companies run user-submitted code through Modal Sandboxes in production.
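
One way to express these limits is a small settings object translated into `modal.Sandbox.create` keyword arguments (`cpu`, `memory`, `timeout`, `block_network`). This is a sketch under assumptions: `SandboxLimits`, `run_untrusted`, the default values, and the app name `untrusted-code-demo` are all hypothetical, and filesystem access controls are not covered here.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class SandboxLimits:
    """Illustrative limits for one untrusted run (defaults are arbitrary)."""

    cpu_cores: float = 0.5       # fractional CPU cores
    memory_mib: int = 512        # memory in MiB
    timeout_seconds: int = 30    # maximum sandbox lifetime
    allow_network: bool = False  # untrusted code gets no network by default

    def to_create_kwargs(self) -> Dict[str, Any]:
        """Translate the limits into keyword arguments for Sandbox.create."""
        return {
            "cpu": self.cpu_cores,
            "memory": self.memory_mib,
            "timeout": self.timeout_seconds,
            "block_network": not self.allow_network,
        }


def run_untrusted(code: str, limits: SandboxLimits = SandboxLimits()) -> int:
    """Run user-submitted Python under the given limits; return the exit code."""
    import modal  # imported lazily; requires the Modal SDK and credentials

    app = modal.App.lookup("untrusted-code-demo", create_if_missing=True)
    sandbox = modal.Sandbox.create(
        app=app,
        image=modal.Image.debian_slim(),
        **limits.to_create_kwargs(),
    )
    try:
        process = sandbox.exec("python", "-c", code)
        return process.wait()
    finally:
        sandbox.terminate()
```

Keeping the limits in one frozen object makes it easy to review and test the policy separately from the code that talks to Modal.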

How fast are Modal's cold starts compared to other cloud sandbox platforms?

Modal achieves sub-second cold starts for pre-cached containers. For GPU workloads, Modal's snapshot technology lets you checkpoint a running container and restore it in under a second on a fresh GPU, something that typically takes 2-10 minutes on other platforms.

Does Modal charge for idle time when sandboxes are not running?

No. Modal charges per second of actual compute usage. When your sandbox finishes executing, billing stops immediately. There are no idle fees, no reserved instance costs, and no minimum usage requirements.
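
The per-second model reduces to simple arithmetic: cost equals seconds of compute multiplied by a per-second rate. The sketch below works through it with a made-up rate. `HYPOTHETICAL_RATE` is not a real Modal price, and `compute_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Made-up rate in USD per second, for illustration only (not a real Modal price).
HYPOTHETICAL_RATE = Decimal("0.001")


def compute_cost(seconds_used: int, rate_per_second: Decimal) -> Decimal:
    """Return the cost of a run billed per second of actual compute."""
    if seconds_used < 0:
        raise ValueError("seconds_used must be non-negative")
    return seconds_used * rate_per_second


# A sandbox that runs for 90 seconds is billed for 90 seconds.
busy_cost = compute_cost(90, HYPOTHETICAL_RATE)
# A sandbox that never runs accrues nothing.
idle_cost = compute_cost(0, HYPOTHETICAL_RATE)
```

At the assumed rate, the 90-second run costs $0.09 and the idle period costs $0.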

Can I use Modal with existing Docker containers or Kubernetes workloads?

Modal can run workloads that were previously containerized with Docker. You can import existing Docker images and run them on Modal, removing the need to manage Kubernetes clusters or Docker registries yourself.

Run your first sandbox in minutes.

Get Started Free

$30 in free compute to get started.