Hermes Agent requires a secure, scalable environment to execute AI-generated code safely. As autonomous agents write and run code independently, selecting the right sandbox infrastructure determines whether your Hermes deployment can handle untrusted code execution, scale to meet demand, and maintain the isolation necessary for production workloads.

This guide examines seven code execution sandboxes for Hermes in 2026, starting with Modal, the serverless compute platform with native Hermes backend support and proven scale at over 1 billion sandboxes executed to date.
Modal delivers serverless compute with native Hermes backend support, gVisor-based isolation, and on-demand GPU access. The platform is configured as `terminal.backend: modal` in Hermes with `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` authentication, making it the most straightforward option for Hermes deployments requiring secure code execution at scale.
Modal has completed a SOC 2 Type II audit and supports HIPAA-compliant workloads on Enterprise plans via a BAA. The platform uses gVisor-based sandboxing for compute isolation, TLS 1.3 for public APIs, and encryption for data in transit and at rest.
Modal powers production workloads for AI companies running agent infrastructure.
Best For: Teams running Hermes agents that need native backend support, configurable session timeouts up to 24 hours with snapshot-based continuation, GPU access for ML workloads, and production-proven scale.
Daytona provides persistent development environments with cold starts and native Hermes backend support. The platform is configured as `terminal.backend: daytona` with `DAYTONA_API_KEY` authentication.
Daytona focuses on workspace continuity rather than ephemeral execution. Sandboxes auto-stop after 15 minutes of inactivity by default but can be configured for indefinite runtime by disabling auto-stop, benefiting Hermes agents that need to preserve cached dependencies or intermediate results. Persistent filesystem state should not be conflated with uninterrupted live-process execution.
Best For: Teams building Hermes agents that require persistent development environments, cold starts, and workspace continuity across sessions.
E2B specializes in secure sandboxes for AI agents using Firecracker microVM isolation. The platform claims adoption by 94% of Fortune 100 companies for frontier agentic workflows, with customers including Perplexity, Hugging Face, and Groq.
E2B excels at ephemeral code execution with strong isolation guarantees. The platform supports up to 1,100 concurrent sandboxes on higher-tier plans. Sandboxes can run continuously for up to 24 hours on Pro plans and 1 hour on Base plans; longer workflows can use pause/resume, which resets the runtime window while preserving full state.
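To make the runtime window concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes the limits quoted above (24 hours on Pro, 1 hour on Base) and assumes every resume grants a fresh, full-length window. The name `pauses_needed` is hypothetical and not part of any E2B SDK.

```python
import math

# Continuous-runtime windows quoted above, in hours.
RUNTIME_WINDOW_HOURS = {"pro": 24, "base": 1}


def pauses_needed(total_hours: float, plan: str) -> int:
    """Return how many pause/resume cycles a workflow of total_hours needs.

    Assumption: each resume resets the runtime window to its full length.
    """
    if total_hours <= 0:
        return 0
    window = RUNTIME_WINDOW_HOURS[plan]
    # Windows required in total, minus the first one (which needs no resume).
    return math.ceil(total_hours / window) - 1
```

Under these assumptions, a 30-hour job on a Pro plan needs one pause/resume cycle, while a 2.5-hour job on a Base plan needs two.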
Best For: Teams building Hermes agents that prioritize kernel-level isolation over session duration, particularly those integrating with LangChain, OpenAI, or Anthropic tooling.
Northflank provides production-grade AI infrastructure with multiple isolation technology options and self-serve bring-your-own-cloud (BYOC) deployment. Northflank documents support for Firecracker, Kata Containers, Cloud Hypervisor, and gVisor, plus BYOC deployments across major clouds and on-premises environments.
Northflank's unique multi-isolation support allows teams to match security requirements to specific workloads. This flexibility benefits regulated industries where different isolation models may be required for different data sensitivity levels.
Best For: Enterprise teams building Hermes agents in regulated industries that need BYOC flexibility, compliance certifications, and configurable isolation technologies.
Vercel Sandbox provides isolated code execution environments in temporary Linux microVMs powered by Firecracker. Vercel Sandbox is generally available and integrates tightly with Vercel's deployment ecosystem.
Vercel Sandbox enforces 45-minute session limits on Hobby tier and up to 5 hours on Pro/Enterprise plans. This constraint requires Hermes workflows to be designed around session boundaries.
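One way to design around session boundaries is to group workflow steps so that each group fits inside a single session. The Python sketch below is a simple greedy planner, not anything provided by Vercel: the function name `plan_sessions` and the per-step time estimates are hypothetical, and the limits are the 45-minute and 5-hour figures quoted above.

```python
# Session limits quoted above, in minutes.
SESSION_LIMIT_MINUTES = {"hobby": 45, "pro": 5 * 60}


def plan_sessions(step_minutes: list[int], plan: str) -> list[list[int]]:
    """Greedily group consecutive steps so each group fits in one session."""
    limit = SESSION_LIMIT_MINUTES[plan]
    sessions: list[list[int]] = []
    current: list[int] = []
    used = 0
    for minutes in step_minutes:
        if minutes > limit:
            raise ValueError(
                f"a {minutes}-minute step cannot fit in a {limit}-minute session"
            )
        if used + minutes > limit:
            # Close the current session and start a new one.
            sessions.append(current)
            current, used = [], 0
        current.append(minutes)
        used += minutes
    if current:
        sessions.append(current)
    return sessions
```

With estimated steps of 20, 20, 20, and 10 minutes, the Hobby limit splits the work into two sessions, while the Pro limit fits everything into one. Any state that must survive a boundary would need to be persisted outside the sandbox between sessions.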
Best For: Teams already using Vercel's ecosystem that need isolated code execution for Hermes agents with shorter workflow durations.
Cloudflare Sandbox delivers code execution using Cloudflare Containers coordinated by Workers and Durable Objects, with each sandbox running in an isolated Linux container.
Cloudflare Sandbox defaults to sleeping after 10 minutes of inactivity, configurable via `sleepAfter`, and can be kept alive with `keepAlive: true`; filesystem and process state are lost when the container stops. This default behavior suits Hermes agents executing short-lived tasks rather than long-running workflows.
Best For: Teams building Hermes agents optimized for latency-sensitive, short-duration tasks within the Cloudflare ecosystem.
Fly.io Sprites launched in January 2026 as persistent hardware-isolated environments specifically for AI coding agents. The platform uses Firecracker microVMs with a unique cost model that charges nothing when sandboxes are idle.
Fly positions new Sprite creation as a core capability. Warm Sprites resume from hibernation, making the platform suitable for intermittent Hermes workflows with natural pauses.
Best For: Teams building Hermes agents with intermittent execution patterns that benefit from persistent storage and cost optimization during idle periods.
Among the managed cloud sandbox providers covered here, Modal is one of only two platforms documented by Hermes as a cloud/serverless `terminal.backend`, configured simply through environment variables. This native integration eliminates the workarounds required when using other managed platforms, reducing setup complexity and ensuring compatibility with Hermes updates.
Modal has executed over 1 billion sandboxes and powers cloud infrastructure for over 10,000 teams. This production track record demonstrates the platform's ability to handle enterprise-scale Hermes deployments. Lovable, one of Modal's customers, runs "tens of thousands of app creation sessions in an instant" using Modal's sandbox infrastructure.
Modal Sandboxes support configurable runtimes up to 24 hours. For workflows that need to continue beyond a single Sandbox lifetime, Modal provides Filesystem Snapshots that preserve Sandbox filesystem state indefinitely until deleted, enabling stateful continuation across subsequent Sandbox runs.
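The snapshot-based continuation pattern might look roughly like the Python sketch below. It assumes the `modal` Python SDK is installed and authenticated, and the calls shown (`App.lookup`, `Sandbox.create`, `exec`, `snapshot_filesystem`, `terminate`) should be checked against the current Sandboxes documentation before use. The app name, function names, and default timeout are hypothetical.

```python
MAX_SANDBOX_TIMEOUT_S = 24 * 60 * 60  # 24-hour ceiling quoted above


def clamp_timeout(seconds: int) -> int:
    """Keep a requested timeout between 1 second and the 24-hour ceiling."""
    return max(1, min(seconds, MAX_SANDBOX_TIMEOUT_S))


def run_then_continue(first_cmd: str, next_cmd: str, timeout_s: int = 3600) -> str:
    """Run first_cmd, snapshot the filesystem, then run next_cmd in a new Sandbox."""
    import modal  # imported lazily so clamp_timeout works without the SDK

    # Hypothetical app name, used for illustration only.
    app = modal.App.lookup("hermes-sandbox-demo", create_if_missing=True)
    timeout = clamp_timeout(timeout_s)

    first = modal.Sandbox.create(app=app, timeout=timeout)
    first.exec("bash", "-c", first_cmd).wait()
    # The snapshot captures filesystem state only, not running processes.
    image = first.snapshot_filesystem()
    first.terminate()

    # A later Sandbox starts from the snapshot and sees the files written above.
    second = modal.Sandbox.create(app=app, image=image, timeout=timeout)
    proc = second.exec("bash", "-c", next_cmd)
    output = proc.stdout.read()
    second.terminate()
    return output
```

For example, `run_then_continue("echo hello > /tmp/state.txt", "cat /tmp/state.txt")` would be expected to read back the file written by the first Sandbox, since the second one boots from its filesystem snapshot.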
Modal supports a broad GPU catalog for GPU-accelerated workloads, including T4, L4, A10, L40S, A100 variants, RTX PRO 6000, H100/H100!, H200, and B200/B200+. Hermes agents can call upon these GPUs when workloads require acceleration for ML inference, code analysis, or model-based code generation.
Modal's AI-native container runtime supports 100k+ concurrent sandboxes with sub-second scheduling and strong cold-start performance on custom images. The platform's custom file system, container runtime, and scheduler are optimized specifically for the elastic scaling patterns that agent workloads demand.
With a completed SOC 2 Type II audit, HIPAA support via a BAA on Enterprise plans, and gVisor-based compute isolation, Modal meets the security requirements that production Hermes deployments demand. The platform encrypts data in transit and at rest and uses TLS 1.3 for public APIs.
For teams deploying Hermes agents that need native backend support, configurable sessions up to 24 hours with snapshot-based continuation, GPU access, and production-proven infrastructure, Modal's combination of scale, security, and AI-native architecture makes it the clear choice.
Get started with Modal Sandboxes for your Hermes deployment.
Effective sandboxes provide secure isolation to run untrusted AI-generated code, fast cold starts for responsive agent interactions, and scalable infrastructure to handle concurrent sessions. Modal addresses all three with gVisor isolation, sub-second scheduling, and support for 100k+ concurrent sandboxes.
Sandboxes isolate code execution from host systems using technologies like gVisor (Modal), Firecracker microVMs (E2B, Vercel, Northflank, Fly.io), or isolated Linux containers (Cloudflare). This isolation prevents AI-generated code from accessing unauthorized resources or affecting other workloads, which is critical when agents execute code autonomously.
Among the managed cloud sandbox providers covered here, Modal and Daytona are the only ones documented by Hermes as cloud/serverless `terminal.backend` options. Modal is configured with `terminal.backend: modal` using token-based authentication, while Daytona uses `terminal.backend: daytona` with API key authentication. Hermes also supports local, Docker, SSH, and Singularity/Apptainer backends.
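Before pointing Hermes at one of these backends, it can help to confirm that the expected credentials are actually set. The following Python sketch is a hypothetical preflight check, not part of Hermes itself: it only encodes the environment variable names mentioned in this guide, and it does not check any backend that is not listed in its table.

```python
import os

# Environment variables each cloud backend expects, per this guide.
REQUIRED_ENV = {
    "modal": ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET"),
    "daytona": ("DAYTONA_API_KEY",),
}


def missing_credentials(backend: str, env=None) -> list[str]:
    """Return the required variables that are unset or empty for a backend."""
    env = os.environ if env is None else env
    # Backends without an entry (for example local or docker) are not checked.
    return [name for name in REQUIRED_ENV.get(backend, ()) if not env.get(name)]
```

Calling `missing_credentials("modal")` in a deployment script returns an empty list when both Modal tokens are present, and otherwise names whichever variable still needs to be set.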
gVisor (used by Modal) implements a user-space kernel that intercepts system calls, providing strong isolation with container-like deployment simplicity. Firecracker (used by E2B, Vercel, Northflank, Fly.io) creates lightweight microVMs with hardware-level isolation. Both approaches protect against untrusted code execution with different performance and security characteristics.
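For teams comparing platforms by isolation model, the information in this guide can be captured as a small lookup table. The Python sketch below is illustrative only: the table reflects what this guide states rather than an authoritative vendor matrix, and the helper name `platforms_by_isolation` is hypothetical.

```python
from collections import defaultdict

# Isolation technologies as described in this guide.
ISOLATION = {
    "Modal": ["gVisor"],
    "E2B": ["Firecracker"],
    "Vercel Sandbox": ["Firecracker"],
    "Fly.io Sprites": ["Firecracker"],
    "Northflank": ["Firecracker", "Kata Containers", "Cloud Hypervisor", "gVisor"],
}


def platforms_by_isolation(table: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert the table: map each isolation technology to its platforms."""
    grouped = defaultdict(list)
    for platform, technologies in table.items():
        for technology in technologies:
            grouped[technology].append(platform)
    return {tech: sorted(names) for tech, names in grouped.items()}
```

Looking up `platforms_by_isolation(ISOLATION)["gVisor"]` yields Modal and Northflank, which shows how a team with a fixed isolation requirement could narrow the list.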
Session limits determine how long an agent can run continuously. Modal Sandboxes support configurable timeouts up to 24 hours, with Filesystem Snapshots for stateful continuation beyond that boundary. Daytona auto-stops idle sandboxes after 15 minutes by default but can be configured for indefinite runtime. E2B runs up to 24 hours on Pro plans and 1 hour on Base plans, with pause/resume that resets the runtime window while preserving state. Cloudflare Sandbox defaults to sleeping after 10 minutes of inactivity and can be kept alive with `keepAlive: true`.
Modal, Daytona, and Northflank publicly market GPU-capable workflows in this category. Modal supports a broad GPU catalog including H100, H200, B200 and A100 variants. Vercel Sandbox and Cloudflare Sandbox documentation focuses on CPU and container execution.