
Best Code Execution Sandbox for Sourcegraph Amp in 2026

Amp (formerly developed at Sourcegraph) and similar AI coding tools are transforming how developers write and ship software. These AI-powered systems generate code, execute tasks, and iterate autonomously, but they require robust sandbox infrastructure to run securely at scale. The right code execution sandbox determines whether your AI coding tools can execute untrusted code safely, scale without manual intervention, and access GPU acceleration when ML workloads demand it.

Modal Team, Engineering
June 2026 · 20 min read

Key Takeaways

  • Secure sandboxed execution is essential for AI coding tools: AI agents generate and run code autonomously, making isolation critical. Modal uses gVisor containers for compute isolation, supporting 100,000+ concurrent sandboxes for multi-tenant AI applications
  • GPU support separates ML-capable sandboxes from CPU-only options: Modal offers one of the broadest GPU catalogs among sandbox-oriented AI infrastructure platforms, with on-demand access to T4, L4, A10, L40S, A100 variants, RTX PRO 6000, H100, H200, and B200/B200+. That breadth, combined with serverless integration, enables ML inference and training within sandboxes
  • Cold start performance impacts interactive agent experiences: Sandbox platforms optimize for different trade-offs. Daytona emphasizes sandbox creation with snapshot-based resume, while Modal offers fast cold starts through an optimized container runtime and filesystem. For Modal Functions, CPU Memory Snapshots can reduce cold-start latency and GPU Memory Snapshots are available as an alpha feature; for Sandboxes, filesystem and directory snapshots are supported, and memory snapshots are in alpha
  • Enterprise compliance enables production deployment: Modal has completed a SOC 2 Type 2 audit and supports HIPAA-compliant workloads on Enterprise plans via a Business Associate Agreement, subject to the customer's own compliance obligations
  • Code-first SDKs accelerate development: Modal lets teams define compute, storage, and networking directly in code with no YAML configuration, with SDKs available in Python, TypeScript, and Go, and Sandboxes able to run code in any language the workload requires

1. Modal

Modal delivers serverless compute for secure code execution at scale, the core sandbox workload for AI coding tools like Amp (formerly developed at Sourcegraph), with on-demand GPU access layered on top for workloads requiring acceleration. The platform takes your code, containerizes it, and executes it in the cloud with automatic scaling. Modal is code-first: infrastructure is defined directly in code rather than YAML, with SDKs available in Python, TypeScript, and Go for defining apps and Functions, using Sandboxes, calling deployed Functions, and managing Modal resources. Code running inside a Sandbox is not limited to one language; a Sandbox can run whatever runtime or language the workload requires.

Core Capabilities

  • gVisor container isolation: Secure sandboxed execution for running AI-generated code, the primary workload for coding-agent sandboxes
  • Massive concurrency: Support for 100,000+ concurrent sandboxes, backed by a custom scheduler and multi-cloud capacity pool, with fast scheduling and strong startup performance
  • Fast startup: Modal's optimized container stack is engineered for fast Sandbox cold starts and faster feedback loops, with an optimized filesystem that brings containers online quickly even when images are large. Memory Snapshots can reduce Function cold-start latency; GPU Memory Snapshots and Sandbox memory snapshots are both alpha features
  • On-demand GPU access: Modal's priced GPU catalog includes 10 GPU SKUs, and its docs expose GPU request values including T4, L4, A10, L40S, A100/A100-40GB/A100-80GB, RTX-PRO-6000, H100/H100!, H200, and B200/B200+, enabling ML inference, training, and compute-intensive analysis within sandboxes
  • Code-first SDKs in Python, TypeScript, and Go: Modal provides SDKs in Python, TypeScript, and Go for defining apps and Functions, using Sandboxes, calling deployed Functions, and managing resources; infrastructure is defined in code with no YAML or config files required, and Sandboxes can run code in any language the workload requires
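As a rough illustration of the code-first model described above, the sketch below creates a Sandbox from Python, runs one snippet in it, and tears it down. It assumes the Modal Python SDK is installed and authenticated; the app name `amp-sandbox-demo` and the helper names `build_exec_args` and `run_in_modal_sandbox` are hypothetical and exist only for this example. Treat the call signatures as a starting point and check them against the current Modal documentation.

```python
from typing import Optional, Tuple


def build_exec_args(code: str) -> Tuple[str, str, str]:
    """Command line used to run a Python snippet inside the sandbox."""
    return ("python", "-c", code)


def run_in_modal_sandbox(code: str, gpu: Optional[str] = None) -> str:
    """Run `code` in a fresh Modal Sandbox and return its stdout.

    Requires `pip install modal` and an authenticated Modal account.
    """
    import modal  # imported lazily so this module loads without the SDK

    # "amp-sandbox-demo" is a hypothetical app name chosen for this sketch.
    app = modal.App.lookup("amp-sandbox-demo", create_if_missing=True)
    sb = modal.Sandbox.create(
        app=app,
        image=modal.Image.debian_slim(),
        timeout=120,  # seconds before the sandbox is shut down
        gpu=gpu,      # e.g. "T4"; None requests a CPU-only sandbox
    )
    try:
        proc = sb.exec(*build_exec_args(code))
        output = proc.stdout.read()  # blocks until the process closes stdout
        proc.wait()
        return output
    finally:
        sb.terminate()  # always release the sandbox, even on errors
```

Calling `run_in_modal_sandbox("print('hello')")` would run the snippet remotely; passing `gpu="T4"` requests an accelerator for the same call without any separate configuration file.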

Security and Compliance

Modal maintains comprehensive security practices including completion of a SOC 2 Type 2 audit. Modal supports HIPAA-compliant workloads on Enterprise plans via a BAA, subject to the customer's own compliance obligations. The platform uses gVisor-based sandboxing for compute isolation, TLS 1.3 for public APIs, and encryption for data in transit and at rest.

Production-Proven Results

Modal powers cloud infrastructure for over 10,000 teams:

  • Ramp uses Modal Sandboxes to power background coding agents that generate code changes and write them back as commits or pull requests
  • Lovable uses Modal Sandboxes for preview environments of generated applications
  • Modal pools hardware across multiple clouds to improve GPU availability and provide access to GPUs without customer-managed cloud quota requests or reservations

What Makes Modal Unique

  • AI-native container runtime: Custom-built infrastructure including file system, container runtime, scheduler, and image builder optimized for AI workloads
  • Memory snapshotting: Memory Snapshots can reduce cold-start latency for initialization-heavy Functions; GPU Memory Snapshots are available as an alpha feature, and Sandbox memory snapshots are alpha
  • Multi-cloud capacity pool: Deep CPU and GPU capacity across major cloud providers ensures availability without reservations
  • Cloud marketplace access: Enterprise customers can transact through AWS and GCP marketplaces to use committed cloud spend on Modal compute

Best For: Teams building AI coding tools that need secure code execution at scale, with on-demand GPU access when workloads call for ML inference, model fine-tuning, or compute-intensive analysis, especially those seeking production-grade infrastructure with proven enterprise scale.

2. E2B

E2B specializes in secure sandboxes for AI agents, focusing on ephemeral code execution with Firecracker microVM isolation. E2B publishes plan limits up to 1,100 concurrent sandboxes with add-ons and enterprise options, and is used by companies including Perplexity, Hugging Face, and Groq.

Core Capabilities

  • Firecracker microVMs: Hardware-level isolation providing strong security boundaries for running untrusted AI-generated code
  • Cold starts: E2B sandboxes are started on demand for ephemeral code execution
  • Multi-language SDKs: Support for Python and TypeScript/JavaScript integration patterns
  • Template system: Reproducible sandbox environments with versioning for consistent agent execution
  • Model Context Protocol (MCP): Native MCP support for standardized agent tool integration

Use Case Focus

E2B excels at ephemeral code execution, spinning up isolated environments for agents to run generated code, then tearing them down. The platform supports up to 1,100 concurrent sandboxes on higher-tier plans with add-ons.

Architecture Approach

E2B's Firecracker microVM isolation provides a virtual-machine security boundary: each sandbox runs in its own microVM with a dedicated guest kernel, offering strong isolation for untrusted code execution. The platform is purpose-built for AI agent workflows with a clean SDK design.

Best For: Teams building coding agents focused on code execution and testing where GPU acceleration is not required, particularly those prioritizing the strongest possible security isolation for untrusted code.

3. Daytona

Daytona provides development environments with sandbox creation capabilities. The platform's open source GitHub repository has accumulated 72,300+ stars and offers both GPU support and configurable runtime persistence.

Core Capabilities

  • Cold starts: Daytona supports sandbox creation and snapshot-based resume
  • Container isolation: Daytona advertises isolated sandbox environments built from Docker/OCI-compatible snapshots; self-hosted runner deployments may involve Docker/Sysbox components
  • Configurable runtime persistence: Sandboxes can be configured for indefinite runtime with snapshot-based resume capabilities
  • GPU support: Available for ML workloads including H100 and RTX PRO options
  • Docker/OCI compatibility: Standard container image support for flexible environment configuration

Architecture Approach

Daytona focuses on persistent workspaces that maintain state across sessions. This approach benefits agents that need to preserve context, cached dependencies, or intermediate results without recreation overhead. The workspace object semantics include auto-stop, archive, and warm start capabilities.

Use Case Focus

Daytona is well-suited for interactive "computer use" agents that spin up many environments. The platform's workspace persistence model supports agents that benefit from continuity across sessions.

Best For: Teams building coding agents that need persistent development environments with workspace continuity.

4. Northflank

Northflank is a full platform-as-a-service with sandbox capabilities, serving 80k+ developers in production and processing 130B+ requests. The platform offers unique flexibility in isolation technology and deployment options.

Core Capabilities

  • Flexible isolation options: Choose between Firecracker, Kata, or gVisor per workload based on security and performance requirements
  • Broad BYOC multi-cloud: Northflank offers unusually broad bring-your-own-cloud deployment across AWS, GCP, Azure, CoreWeave, Oracle, Civo, and on-prem/on-Kubernetes environments
  • GPU support: Available, with options ranging from L4 through H200
  • Full platform integration: Sandboxes as part of broader infrastructure including APIs, databases, and workers
  • Enterprise SSO: SAML/OIDC support with comprehensive RBAC controls

Architecture Approach

Northflank's architecture flexibility allows teams to select isolation technology based on workload requirements. The platform supports Firecracker microVMs, Kata containers, or gVisor, configurable per workload.

Use Case Focus

Northflank excels for enterprise teams with specific compliance or data sovereignty requirements. The BYOC capability enables deployment in your own cloud account while maintaining platform benefits.

Best For: Enterprise teams with BYOC requirements, compliance mandates requiring data residency, or a need for flexible isolation technology selection per workload.

5. Together Code Sandbox

Together Code Sandbox is a managed sandbox environment for AI-powered coding tools, part of Together AI's broader platform. The company raised a $305M Series B at a $3.3B valuation; total funding is reported at about $534M, positioning it as a well-funded option in the space.

Core Capabilities

  • Configurable VM sandboxes: Fully configurable development environments where users can run code, install dependencies, and run servers inside sandboxed VMs
  • Startup times: Sandboxes can be spun up from a template
  • Programmatic code execution: SDK supports programmatic creation of development environments and execution of untrusted code
  • Together AI ecosystem integration: Tight integration with Together AI's inference and model offerings
  • CodeSandbox-backed environments: Development environments with SDK control, terminal access, port previews, sessions, Docker/Dev Container support, and persistent storage

Use Case Focus

Together Code Sandbox is geared toward building and scaling AI coding tools that need isolated development environments. The platform is positioned around secure code sandboxes for AI development environments and AI-powered coding workflows, with project persistence capabilities.

Best For: Teams building AI coding tools that benefit from integration with Together AI's broader inference and model ecosystem.

6. Cloudflare Sandboxes

Cloudflare Sandboxes is a code execution environment exposed through the Sandbox SDK and running on Cloudflare's global edge network. The platform is positioned for Python and Node.js workloads, with a TypeScript-first SDK design.

Core Capabilities

  • Python and Node.js execution: Support for running Python scripts, Node.js applications, code compilation, and data-processing workloads
  • Global edge network: Cloudflare's global network reaches a large share of Internet-connected users; actual Sandbox latency depends on workload, placement, startup, and request path
  • TypeScript-first SDK: API for sandbox lifecycle management, command execution, file operations, terminal access, and WebSocket connections
  • Cloudflare ecosystem integration: Integrates with Workers and supports storage and backups through R2/S3/GCS-compatible storage
  • Configurable persistence: Support for keepAlive and configurable sleep behavior

Architecture Approach

Each sandbox has an isolated filesystem, runs in a dedicated Linux container, and maintains state while active. The platform is centered around edge execution, making it well-suited for globally distributed workloads.

Use Case Focus

Cloudflare Sandboxes is framed around secure code execution and programmable sandbox workflows. The platform includes tutorials for AI code executors and AI coding agents built with agent SDKs.

Best For: Teams looking for edge-distributed code execution with Cloudflare ecosystem integration, particularly those preferring a TypeScript-first development model.

7. Vercel Sandbox

Vercel Sandbox is an isolated code execution environment built for running untrusted code in temporary Linux microVMs. The platform is powered by Firecracker and fits into Vercel's broader ecosystem.

Core Capabilities

  • Firecracker microVMs: Each environment runs in an on-demand Linux microVM with its own filesystem, network, and process space
  • Ephemeral runtime model: Sandboxes are temporary by design, started when needed and stopped after use
  • Developer-friendly Linux access: Full Linux environment with sudo, package managers, and standard command-line workflows
  • State persistence options: Vercel supports snapshots for saving and restoring state; otherwise sandbox filesystem data is destroyed when the sandbox stops, and snapshots expire 30 days after last use by default
  • Vercel ecosystem fit: Fits naturally into the Vercel/Next.js ecosystem and has documented AI-agent examples

Architecture Approach

Vercel Sandbox is best understood as an execution layer for secure, isolated code execution rather than a full infrastructure platform for GPU-heavy AI workloads. Its fit is strongest for agent workflows that involve repeated start-run-stop cycles and short-lived tasks.

Use Case Focus

The platform supports session durations of 45 minutes on Hobby plans and 5 hours on Pro plans, making it suitable for interactive development sessions rather than long-running workloads.
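The limits quoted for Vercel Sandbox (45-minute and 5-hour sessions, and the 30-day default snapshot expiry listed under Core Capabilities) can be turned into a quick planning check. This is a small sketch using only those published figures; the function names are hypothetical and nothing here calls a Vercel API.

```python
from datetime import datetime, timedelta

# Figures taken from the text above.
SESSION_LIMITS = {"hobby": timedelta(minutes=45), "pro": timedelta(hours=5)}
SNAPSHOT_TTL = timedelta(days=30)  # default expiry after last use


def fits_in_session(plan: str, expected_runtime: timedelta) -> bool:
    """True if a task of the given length fits in one session on `plan`."""
    return expected_runtime <= SESSION_LIMITS[plan.lower()]


def snapshot_expiry(last_used: datetime) -> datetime:
    """Date after which a snapshot last used at `last_used` expires by default."""
    return last_used + SNAPSHOT_TTL
```

A task expected to run for an hour, for example, would not fit a Hobby session and would need either a Pro plan or a snapshot-and-resume strategy.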

Best For: Teams that need isolated environments for code execution within the Vercel/Next.js ecosystem, especially when building AI features that fit alongside documented AI-agent examples.

Why Modal Stands Out for Amp and AI Coding Tools

Purpose-Built for AI Agent Workloads

Modal's architecture is specifically engineered for AI and machine learning workloads. The platform's custom container runtime, scheduler, and file system are optimized for the unique demands of elastic infrastructure with fast cold starts, sandboxed code execution, GPU-accelerated computation, and dynamic scaling that AI coding tools require.

Secure Sandboxed Execution at Scale

Most coding-agent sandbox work involves CPU-based execution of the code the agent generates, and Modal's sandboxes are built to handle that workload at massive scale. The platform supports 100,000+ concurrent sandboxes with fast scheduling and gVisor isolation. It also provides sandbox health and lifecycle tracking, readiness probes, detailed logs, metrics, and real-time resource visibility, all of which are essential for AI coding tools like Amp that generate and execute untrusted code.

On-Demand GPU Access When Workloads Require It

On top of the CPU baseline, agents can access GPUs on demand when workloads require acceleration, a key differentiator for a sandbox platform. Modal supports a broad GPU lineup including T4, L4, A10, L40S, A100 variants, RTX PRO 6000, H100, H200, and B200/B200+, letting agents match compute to the task at hand, whether running lightweight code analysis models or large language models for code generation.
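One way an agent might match compute to the task is to pick the smallest GPU whose memory covers the model it needs to load. The sketch below assumes that approach. The request values come from the lineup above, while the memory sizes are the commonly published figures for each card and are not stated in this article; the `pick_gpu` helper and the 20% headroom factor are illustrative assumptions.

```python
from typing import Optional

# (request value, memory in GB), ordered from smallest to largest memory.
# Memory sizes are commonly published figures, not taken from this article.
GPU_MEMORY_GB = [
    ("T4", 16),
    ("L4", 24),
    ("A10", 24),
    ("A100-40GB", 40),
    ("L40S", 48),
    ("A100-80GB", 80),
    ("H100", 80),
    ("H200", 141),
]


def pick_gpu(required_gb: float, headroom: float = 1.2) -> Optional[str]:
    """Return the first GPU whose memory covers required_gb * headroom."""
    needed = required_gb * headroom
    for name, memory_gb in GPU_MEMORY_GB:
        if memory_gb >= needed:
            return name
    return None  # nothing in the table is large enough for a single GPU
```

The returned string could be passed as the GPU request when creating a sandbox; a `None` result signals that the workload needs a larger card or multiple GPUs.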

Developer Experience Without Compromise

Modal is code-first, with SDKs available in Python, TypeScript, and Go for defining apps and Functions, using Sandboxes, calling deployed Functions, and managing Modal resources. Teams define compute requirements, container images, and scaling behavior directly in code, with no YAML or config files required, and Sandboxes can run code in any language the workload requires. Explore the Modal documentation to see how this approach enables rapid iteration.

Enterprise Security and Compliance

Modal has completed a SOC 2 Type 2 audit and supports HIPAA-compliant workloads on Enterprise plans via a BAA, subject to the customer's own compliance obligations. Combined with comprehensive security practices including gVisor sandboxing and TLS 1.3, these can help satisfy common enterprise and healthcare security requirements. Audit logs are available on Enterprise. Modal also supports container region selection for Functions and Sandboxes.

Production-Proven Scale

Modal powers cloud infrastructure for over 10,000 teams, including companies like Ramp and Lovable. This production track record demonstrates the platform's ability to handle enterprise-scale AI coding tool workloads reliably.

For teams integrating Amp or building AI coding tools that require secure code execution, production-grade reliability, and on-demand CPU and GPU access, Modal's combination of AI-native infrastructure, sandboxed execution at scale, and proven enterprise capabilities makes it the standout choice.

Get started with Modal Sandboxes to power your AI coding tools.


Frequently asked questions

What is a code execution sandbox and why is it crucial for AI development?

A code execution sandbox is an isolated environment where code can run without accessing host systems, other workloads, or sensitive data. For AI coding tools like Amp that generate and execute code autonomously, sandboxing prevents malicious or buggy generated code from causing damage. Modal's sandboxes use gVisor isolation and support 100,000+ concurrent sandboxes with detailed logging and metrics for monitoring agent behavior.
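To make the interface concrete, the sketch below shows the shape of the contract an agent typically has with a sandbox: submit code, get back an exit code and captured output, and have runaway code cut off by a timeout. It uses only the Python standard library and is explicitly not a security boundary. A child process with a throwaway working directory does not provide the isolation that gVisor or a microVM does, and the names `ExecResult` and `run_untrusted` are hypothetical.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass
class ExecResult:
    exit_code: int
    stdout: str
    stderr: str
    timed_out: bool


def run_untrusted(code: str, timeout_s: float = 5.0) -> ExecResult:
    """Run a Python snippet in a child process with a throwaway working dir.

    NOT a security boundary: a real sandbox adds kernel-level isolation.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,          # files written here vanish afterwards
                capture_output=True,
                text=True,
                timeout=timeout_s,    # kill the child if it runs too long
            )
        except subprocess.TimeoutExpired:
            return ExecResult(-1, "", "", True)
        return ExecResult(proc.returncode, proc.stdout, proc.stderr, False)
```

A production sandbox exposes much the same contract, but enforces it with an isolation layer the generated code cannot escape.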

How does Modal ensure secure execution for untrusted AI-generated code?

Modal uses gVisor-based sandboxing for compute isolation. gVisor is a user-space application kernel that intercepts system calls from the sandboxed workload, so untrusted code does not interact with the host kernel directly. The platform also implements TLS 1.3 for public APIs, encryption for data in transit and at rest, and has completed a SOC 2 Type 2 audit. For regulated industries, Modal supports HIPAA-compliant workloads via a Business Associate Agreement on Enterprise plans.

What are the key differences between serverless GPU providers like Modal and CPU-only sandbox platforms?

Modal offers one of the broadest GPU catalogs among sandbox-oriented platforms, with a priced catalog of 10 GPU SKUs from T4 through B200. This enables ML inference, model training, and GPU-accelerated analysis within sandboxes. While some other sandbox-adjacent providers such as Daytona and Northflank also advertise GPU support, CPU-only platforms like E2B focus on ephemeral code execution and do not provide GPU acceleration.

Can Modal Sandboxes be used for HIPAA-compliant AI workloads?

Modal supports HIPAA-compliant workloads on Enterprise plans via a BAA. Combined with completion of a SOC 2 Type 2 audit, gVisor-based isolation, and TLS 1.3 encryption, Modal can help meet the security requirements for healthcare and other regulated industries deploying AI coding tools, subject to the customer's own compliance obligations.

How do cold starts affect the performance of AI models deployed within a sandbox environment?

Cold start latency determines how quickly a sandbox can begin executing code after being requested. Modal's optimized container stack delivers fast cold starts for Sandboxes. Memory Snapshots can reduce Function cold-start latency, and GPU Memory Snapshots are available as an alpha feature. Other platforms such as Daytona and E2B also support on-demand sandbox startup. Modal pairs fast cold starts with on-demand GPU access, so teams do not have to choose between startup performance and GPU capabilities.
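Because published cold-start figures depend heavily on image size and workload, it can be worth measuring request-to-first-result time for your own setup. The harness below is a minimal, provider-agnostic sketch. `start_and_run` is a hypothetical callable you would supply, one that creates a sandbox, runs a trivial command, and returns once output arrives.

```python
import time
from statistics import median
from typing import Callable, Dict, List


def measure_startup(start_and_run: Callable[[], object], runs: int = 5) -> List[float]:
    """Return wall-clock seconds for each request-to-first-result cycle."""
    samples = []
    for _ in range(runs):
        begin = time.perf_counter()
        start_and_run()  # should create a sandbox and wait for first output
        samples.append(time.perf_counter() - begin)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Condense samples into min, median, and max latency."""
    return {"min": min(samples), "median": median(samples), "max": max(samples)}
```

Reporting the median alongside the maximum helps separate typical startup time from occasional slow outliers, which matter most for interactive agents.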

What scale of concurrent sandbox sessions can Modal support for multi-tenant AI applications?

Modal's architecture supports 100,000+ concurrent sandboxes through its custom scheduler and multi-cloud capacity pool, with the Sandboxes product page citing fast scheduling even at 100k+ concurrent sandboxes. This scale has been proven in production with customers like Lovable using Modal Sandboxes for preview environments and Ramp using them for background coding agents. This concurrency capability is significantly higher than alternatives like E2B, which supports up to 1,100 concurrent sandboxes with add-ons on higher-tier plans.
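For capacity planning, the concurrency figures quoted above can be compared against an estimate of peak demand. The sketch below treats "100,000+" as a floor and uses only the two limits cited in this article; the per-user multiplier and function names are illustrative assumptions.

```python
import math
from typing import List

# Published figures quoted in this article; "100,000+" is treated as a floor.
CONCURRENCY_LIMITS = {"Modal": 100_000, "E2B": 1_100}


def peak_concurrency(active_users: int, sandboxes_per_user: float) -> int:
    """Estimate peak concurrent sandboxes for a multi-tenant application."""
    return math.ceil(active_users * sandboxes_per_user)


def platforms_that_fit(peak: int) -> List[str]:
    """Names of platforms whose quoted limit covers the estimated peak."""
    return [name for name, limit in CONCURRENCY_LIMITS.items() if limit >= peak]
```

With 2,000 active users averaging 1.5 sandboxes each, the estimated peak is 3,000 concurrent sandboxes, which exceeds the E2B figure quoted above but sits well within Modal's.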

Run your first sandbox in minutes.


$30 in free compute to get started.