AI agents tackling multi-day tasks need more than ephemeral containers that vanish after each invocation. Stateful sandboxes preserve filesystem state, memory, and running processes between sessions, enabling agents to work on complex projects without rebuilding their environment from scratch. Choosing the right sandbox platform determines whether your agents can maintain context across hours or weeks, resume work instantly, and execute untrusted code securely at scale.

Modal delivers serverless compute for secure AI agent sandboxes at massive scale, with on-demand GPU access for workloads requiring acceleration. The platform's custom-built infrastructure handles dynamically defined containers that can support 100k+ concurrent sandboxes with fast startup times.
Modal maintains SOC 2 Type II certification and supports HIPAA-compliant workloads on Enterprise plans via a Business Associate Agreement. The platform uses gVisor-based sandboxing for compute isolation, TLS 1.3 for public APIs, and encryption for data in transit and at rest. Modal documents vulnerability remediation SLAs and publishes a detailed security guide covering application, corporate, and infrastructure security practices.
Modal's AI-native platform page describes an AI-native container runtime; optimized filesystem behavior; a multi-cloud capacity pool; storage, data, and networking primitives; and observability. Modal's docs also cover image-building APIs and scheduling and resource controls. The architecture supports dynamically defined sandboxes that can preserve state through Volumes and filesystem snapshots while scaling elastically based on demand. Per-sandbox observability enables monitoring and debugging of long-running agent sessions.
Best For: Teams building AI agents that need secure code execution at scale, with persistent state through Volumes and filesystem snapshots, broad GPU access, and enterprise-grade compliance for production deployments. Sandbox memory snapshots can additionally clone full sandbox state and are currently in alpha.
Northflank provides persistent sandbox infrastructure with unlimited session duration and self-serve BYOC (bring your own cloud) deployment across multiple cloud providers.
Northflank reports SOC 2 Type II compliance. Its public security page does not indicate HIPAA BAA availability and currently lists HIPAA as "No." The platform supports data residency requirements through BYOC deployment.
Northflank focuses on persistent workspaces that maintain state across sessions indefinitely. The platform provides full infrastructure alongside sandboxes, including databases and APIs, making it suitable for agents that need access to additional services.
Best For: Teams requiring unlimited session duration, BYOC deployment for data residency requirements, and a choice of isolation technologies.
E2B specializes in secure sandboxes for AI agents with Firecracker microVM isolation and cold starts for ephemeral code execution.
E2B offers 1-hour sessions on base plans and 24-hour sessions on Pro plans. Paused sandboxes can be retained for resumption, with the platform supporting up to 1,100 concurrent sandboxes on higher-tier plans.
E2B excels at ephemeral code execution patterns, spinning up isolated environments for agents to run generated code. E2B caps continuous running sessions by plan, but supports long-lived state via pause/resume; paused sandboxes can preserve full state beyond the continuous runtime window.
Best For: Teams building agents focused on code execution and testing with continuous sessions under 24 hours, particularly those needing Firecracker-level isolation and cold starts.
Blaxel positions itself as a perpetual sandbox platform for AI agents, with a focus on persistent "agent computers" that stay on standby and resume on demand.
Blaxel reports SOC 2 Type II and ISO 27001 certifications and offers a HIPAA BAA, providing comprehensive compliance coverage for enterprise deployments.
Blaxel emphasizes persistent state over ephemeral execution. The platform recommends treating sandboxes as persistent computers that retain context over time, benefiting agents that need continuity across workflows instead of clean-room execution on every task.
Best For: Teams building coding agents requiring persistent sandbox environments, resume from standby, and comprehensive compliance certifications.
Fly.io Sprites delivers persistent Linux VMs with checkpoint/restore capabilities designed specifically for AI agent workloads that need to maintain state across sessions.
Sprites are positioned as persistent VMs rather than ephemeral containers, for AI agents that need to preserve state between invocations. Target workflows include agentic coding, such as Claude Code-style persistent coding environments.
Best For: Individual developers and teams using Claude Code or similar agentic coding tools that need persistent Linux VMs with global edge deployment.
Daytona provides persistent development environments with Docker compatibility and strong editor integration.
Daytona offers SOC 2 Type I certification with HIPAA support. The platform supports on-premises deployment for organizations with data residency requirements.
Daytona focuses on workspace continuity, maintaining state across sessions with emphasis on developer experience through editor integration. The Docker-compatible runtime enables use of standard container images.
Best For: Development teams that prioritize editor integration and Docker compatibility, particularly those standardizing on containers for agent development workflows.
Runloop provides sandbox infrastructure with integrated AI benchmarking capabilities, designed for teams evaluating and fine-tuning coding agents.
Runloop reports SOC 2 Type II and GDPR compliance, along with a HIPAA-eligible architecture with BAA availability for eligible workloads. It also offers VPC deployment for additional data isolation.
Runloop combines sandbox execution with agent evaluation tooling, enabling teams to run benchmarks alongside production workloads. The platform supports both development iteration and production deployment.
Best For: Teams actively evaluating and fine-tuning coding agents who need integrated benchmarking alongside secure sandbox execution.
Modal's architecture addresses the core requirements of stateful agent sandboxes through purpose-built infrastructure. The platform's AI-native container runtime, optimized filesystem behavior, and scheduling controls are built for the unique demands of AI agent workloads: secure code execution, dynamic scaling, and efficient state preservation through Volumes and filesystem snapshots.
While some platforms cap concurrent sessions at 1,100, Modal's sandbox infrastructure supports 100k+ concurrent sandboxes with fast startup times. This scale enables enterprise deployments where hundreds or thousands of agents operate simultaneously without capacity constraints.
Long-running agents often need to call upon GPU acceleration for ML inference, code analysis, or model fine-tuning. Modal offers the broadest GPU lineup among the platforms covered in this guide, from T4 for lightweight inference through H200 and B200 for large-scale computation. Agents can dynamically request GPU resources as workloads demand, without pre-provisioning.
For Modal Functions, Memory Snapshots capture CPU memory, with alpha GPU Memory Snapshots additionally capturing GPU state, reducing initialization-heavy cold starts. For long-running Modal Sandboxes, state is preserved through Volumes and filesystem snapshots, which persist indefinitely until deleted. Sandbox memory snapshots can clone full sandbox state and are currently in alpha, so filesystem snapshots and Volumes are the recommended primitives for agents with heavyweight dependencies or model loading requirements.
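The filesystem-snapshot handoff described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `handoff` function, its parameters, and the app name are hypothetical, and the `sdk` argument exists only so the flow can be exercised with a stand-in object. The calls it makes (`App.lookup`, `Sandbox.create`, `exec`, `snapshot_filesystem`, `terminate`) follow Modal's Python SDK as documented at the time of writing; check the current docs before relying on exact signatures.

```python
def handoff(app_name, setup_cmd, check_cmd, sdk=None):
    """Run setup_cmd in one sandbox, snapshot its filesystem, then run
    check_cmd in a second sandbox restored from that snapshot.

    Hypothetical helper for illustration; not part of Modal's SDK.
    """
    if sdk is None:
        import modal as sdk  # real SDK; requires Modal credentials to run

    app = sdk.App.lookup(app_name, create_if_missing=True)

    # First session: do some work that changes the filesystem.
    first = sdk.Sandbox.create(app=app)
    first.exec(*setup_cmd).wait()

    # Capture the filesystem as an image, then shut the sandbox down.
    image = first.snapshot_filesystem()
    first.terminate()

    # Second session: start from the snapshot instead of a clean image.
    second = sdk.Sandbox.create(app=app, image=image)
    output = second.exec(*check_cmd).stdout.read()
    second.terminate()
    return output
```

With real credentials, a call such as `handoff("agent-demo", ("bash", "-c", "echo hi > /note"), ("cat", "/note"))` would be expected to read back the file written in the first sandbox. Memory and running processes are not carried over by this path; only filesystem state is.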
Production agent deployments require enterprise-grade security. Modal maintains SOC 2 Type II certification, supports HIPAA-compliant workloads via BAA on Enterprise plans, and implements comprehensive security practices including gVisor sandboxing, TLS 1.3, and encryption at rest. The security documentation details vulnerability remediation SLAs and shared responsibility models.
Modal takes a code-first approach, with SDKs in Python, TypeScript, and Go for creating Sandboxes, calling Modal Functions, and managing resources. Code running inside a sandbox is not limited to a single language; a sandbox can run whatever runtime or language the workload requires. These SDKs eliminate YAML configuration files and infrastructure management overhead. Teams define sandbox environments, compute requirements, and scaling behavior directly in code, enabling rapid iteration on agent architectures. The platform handles container builds, scheduling, and auto-scaling automatically.
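As a rough illustration of what defining a sandbox in code can look like, the sketch below keeps the environment, timeout, and optional GPU in a small Python object and passes them to the SDK. `SandboxSpec`, `launch`, and `MAX_TIMEOUT_S` are hypothetical names introduced here, not part of Modal's SDK; the SDK calls inside `launch` reflect the Python SDK as documented at the time of writing and should be checked against current docs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The 24-hour ceiling on a single sandbox run mentioned in this article.
MAX_TIMEOUT_S = 24 * 60 * 60


@dataclass(frozen=True)
class SandboxSpec:
    """Hypothetical description of one sandbox (not a Modal type)."""
    app_name: str
    pip_packages: Tuple[str, ...] = ()
    timeout_s: int = 3600
    gpu: Optional[str] = None  # e.g. "T4"; None means CPU only

    def __post_init__(self):
        if not 0 < self.timeout_s <= MAX_TIMEOUT_S:
            raise ValueError("timeout_s must be between 1 second and 24 hours")


def launch(spec: SandboxSpec):
    """Create a sandbox from the spec. Requires the modal package and credentials."""
    import modal  # imported lazily so specs can be built without the SDK installed

    app = modal.App.lookup(spec.app_name, create_if_missing=True)
    image = modal.Image.debian_slim()
    if spec.pip_packages:
        image = image.pip_install(*spec.pip_packages)
    return modal.Sandbox.create(
        app=app, image=image, timeout=spec.timeout_s, gpu=spec.gpu
    )
```

Keeping the spec separate from the launch call is a design choice for this sketch: the configuration can be validated and reviewed like any other code before any infrastructure is touched.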
Modal powers cloud infrastructure for over 10,000 teams, including AI companies building production agent systems. Lovable uses Modal Sandboxes as preview environments for generated apps and websites, and Ramp runs background coding agents on Modal Sandboxes that generate code changes and write them back into commits or pull requests. This track record demonstrates the platform's ability to handle enterprise-scale agent workloads reliably, from prototype through production deployment.
For teams building AI agents that require secure stateful execution, persistent state through Volumes and filesystem snapshots, broad GPU access, and enterprise compliance, Modal's combination of AI-native infrastructure and proven scale makes it the clear choice for long-running agent sessions.
Explore the Modal Sandboxes documentation to get started.
A stateful sandbox is an isolated compute environment that preserves filesystem state, installed dependencies, and execution context between sessions. Unlike ephemeral containers that reset after each invocation, stateful sandboxes enable AI agents to work on multi-day tasks without losing progress. This persistence is critical for agents tackling complex projects that span hours or weeks, such as codebase refactoring, data processing pipelines, or iterative research tasks.
Modal uses gVisor-based sandboxing for compute isolation, providing syscall-level isolation that adds a strong layer between workloads and host systems as part of a defense-in-depth approach. The platform implements TLS 1.3 for all API traffic, encrypts data in transit and at rest, and maintains SOC 2 Type II certification. For regulated industries, Modal supports HIPAA-compliant workloads via Business Associate Agreements on Enterprise plans.
Yes. Modal's sandbox infrastructure handles CPU-based code execution at scale while enabling agents to call upon GPUs when workloads require acceleration. The platform supports GPU options from T4 through B200, allowing agents to dynamically access the compute resources they need for ML inference, model fine-tuning, or compute-intensive analysis.
Purpose-built AI infrastructure eliminates configuration overhead and operational complexity. Modal's AI-native container runtime, optimized filesystem behavior, and scheduling controls are built for agent workloads, providing fast cold starts, state preservation through Volumes and filesystem snapshots, and elastic scaling without manual capacity management. Teams define everything in code through native SDKs rather than wrestling with YAML configuration or Kubernetes clusters.
Modal supports HIPAA-compliant workloads on Enterprise plans through Business Associate Agreements. The platform implements security controls including gVisor sandboxing, encryption in transit and at rest, TLS 1.3 for APIs, and vulnerability remediation SLAs. The security documentation details the shared responsibility model for compliant deployments.
Session duration limits determine whether agents can work continuously on multi-day tasks or must implement complex checkpointing logic. Some platforms cap sessions at 24 hours, requiring agents to save state externally and rebuild environments after timeout. Modal Sandboxes can be configured to run up to 24 hours; for workflows beyond that, Modal recommends filesystem snapshots to preserve state and restore into a subsequent sandbox, allowing agents to resume work without rebuilding the full environment.
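To make the arithmetic concrete, the sketch below splits a long task into sessions that each fit under a per-session cap, with one snapshot-and-restore handoff between consecutive sessions. `plan_sessions` is a hypothetical planning helper written for this article; it does not call any platform API, and the 24-hour default is simply the cap discussed above.

```python
import math


def plan_sessions(total_hours, max_session_hours=24.0):
    """Return (start_hour, end_hour) windows covering total_hours,
    each no longer than max_session_hours."""
    if total_hours <= 0 or max_session_hours <= 0:
        raise ValueError("hours must be positive")

    count = math.ceil(total_hours / max_session_hours)
    windows = []
    for i in range(count):
        start = i * max_session_hours
        end = min(start + max_session_hours, total_hours)
        windows.append((start, end))
    return windows


# A 60-hour task under a 24-hour cap needs three sessions,
# and therefore two snapshot handoffs between them.
windows = plan_sessions(60)
handoffs = len(windows) - 1
```

In general, a task of T hours under a cap of C hours needs ceil(T / C) sessions and one fewer handoff, so the cost of each snapshot and restore matters more as tasks grow longer.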