How to Run Untrusted Code Safely in Production with AI Sandboxes

Key Takeaways

gVisor-based sandboxing intercepts system calls in user space via an "application kernel" that handles all workload syscalls without passing them directly to the host kernel, significantly reducing the attack surface for kernel-level exploits
Sanitization alone is insufficient for AI-generated code; sandboxing is a required security control, not an optional enhancement
Modal Sandboxes support 50,000+ concurrent sessions with gVisor isolation, a SOC 2 Type II attestation, and HIPAA-compliant workloads on Enterprise plans via a BAA

Understanding the Risks: Why Untrusted Code Needs Sandboxing

When AI systems generate code, you're executing instructions written by a model that learned from vast datasets including potentially malicious examples. Unlike human-written code that goes through review processes, AI-generated code often executes immediately, creating a direct path from model output to system access. The threat vectors are concrete:

System compromise: AI-generated code can attempt privilege escalation, accessing resources beyond its intended scope
Data exfiltration: Malicious or confused code might read sensitive files or environment variables and transmit them externally
Resource exhaustion: Infinite loops, memory leaks, or fork bombs can crash host systems
Lateral movement: Code that escapes container boundaries can attack other workloads or infrastructure
Supply chain injection: AI might generate code that downloads and executes additional untrusted payloads

These risks are not theoretical. In early 2026, PromptArmor disclosed a vulnerability in Snowflake's Cortex Code CLI (coordinated public disclosure March 16, 2026; fix shipped in version 1.0.25 on February 28, 2026). The incident demonstrated how indirect prompt injection combined with weak command validation allowed AI-generated instructions to bypass human-in-the-loop approval and escape the CLI's sandbox mode, enabling arbitrary code execution and unauthorized access to cached credentials. Separately, CVE-2024-21626, the "Leaky Vessels" runc vulnerability, showed how a file descriptor leak in the container runtime could enable container escape and host filesystem access. Together, these incidents illustrate that both AI agent frameworks and underlying container runtimes present exploitable attack surfaces. Without proper sandboxing, every AI code execution becomes a potential security incident.

What is an AI Sandbox? Redefining Secure Execution for ML Workloads

An AI sandbox is an isolated execution environment specifically designed to run code generated by AI systems without risking production infrastructure. Unlike traditional sandboxes built for testing, AI sandboxes must handle unpredictable outputs from language models that might generate anything from harmless scripts to sophisticated exploitation attempts. These environments operate on a simple principle: assume all AI-generated code is potentially hostile, and constrain its capabilities accordingly.

The Role of Isolation Technologies (gVisor, VMs, Containers)

Different isolation technologies offer varying security-performance tradeoffs:

Standard Containers (Docker): Share the host kernel, providing namespace isolation but leaving kernel vulnerabilities exposed. As NIST SP 800-190 notes, containers involve OS-level virtualization where multiple applications share a host kernel. Container escapes remain possible through kernel or runtime vulnerabilities, as demonstrated by CVE-2024-21626.

gVisor: A user-space "application kernel" that intercepts system calls before they reach the host kernel. gVisor's Sentry component handles all workload syscalls and never passes any system call directly to the host; the Sentry itself is restricted to a small allowlist of roughly 68 host syscalls, significantly reducing the kernel attack surface. Overhead is highly workload-dependent: CPU-bound tasks may see modest impact, while syscall-heavy, filesystem-intensive, or network-heavy workloads can experience more substantial slowdowns, as documented in peer-reviewed benchmarking. Modal uses gVisor for its sandbox isolation.

Firecracker microVMs: Full hardware virtualization in lightweight packages, offering strong isolation with higher resource consumption than containers.

Kata Containers: Combines container orchestration with VM isolation, offering a middle ground between standard containers and full VMs by running each container inside a lightweight virtual machine while preserving the container API.
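
To make the difference between a standard container and a gVisor-backed one concrete, here is a minimal Python sketch that builds (but does not run) a locked-down `docker run` command. It assumes gVisor is installed on the host and registered with Docker under the runtime name `runsc`; the helper name `build_gvisor_run_command` and the default limits are illustrative, not taken from any platform's API.

```python
import shlex
from typing import List


def build_gvisor_run_command(
    image: str,
    command: List[str],
    memory: str = "256m",
    cpus: str = "0.5",
    pids_limit: int = 64,
) -> List[str]:
    """Build (but do not execute) a restrictive `docker run` argv.

    Hypothetical helper: the defaults are placeholders to tune per workload.
    """
    return [
        "docker", "run", "--rm",
        "--runtime=runsc",             # assumes gVisor is registered as "runsc"
        "--network=none",              # no network access by default
        f"--memory={memory}",          # memory cap
        f"--cpus={cpus}",              # CPU quota
        f"--pids-limit={pids_limit}",  # limits process count (fork bombs)
        "--read-only",                 # read-only root filesystem
        "--cap-drop=ALL",              # drop all Linux capabilities
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = build_gvisor_run_command(
        "python:3.12-slim", ["python", "-c", "print('hello')"]
    )
    print(shlex.join(argv))
```

Swapping `--runtime=runsc` for the default runtime yields an ordinary shared-kernel container, which is the configuration the container-escape discussion above warns about.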

Key Characteristics of an Effective AI Sandbox

Production AI sandboxes require specific capabilities beyond basic isolation:

Dynamic provisioning: Spin up isolated environments on-demand without pre-allocation
Resource limits: Enforce CPU, memory, and storage caps to prevent resource exhaustion
Network controls: Restrict or eliminate network access based on workload requirements
Filesystem isolation: Prevent access to host filesystems while enabling necessary data inputs
Time limits: Automatically terminate long-running processes before they cause damage
Audit trails: Log all system calls and resource access for security review
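
The resource-limit and time-limit items above can be sketched with the Python standard library alone. The example below is Unix-only and is not a sandbox by itself: it provides no filesystem, network, or kernel isolation, so in practice it would run inside an isolated environment such as the ones described earlier. The function name `run_with_limits` and its default values are assumptions made for illustration.

```python
import resource
import subprocess
import sys
from typing import Callable, Dict


def _limit_setter(cpu_seconds: int, memory_bytes: int) -> Callable[[], None]:
    """Return a function that lowers rlimits in the child process before exec."""

    def _cap(kind: int, wanted: int) -> None:
        _, hard = resource.getrlimit(kind)
        # Never ask for more than an existing hard limit; that would raise.
        value = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
        resource.setrlimit(kind, (value, value))

    def _apply() -> None:
        _cap(resource.RLIMIT_CPU, cpu_seconds)  # CPU time, in seconds
        _cap(resource.RLIMIT_AS, memory_bytes)  # address space, in bytes

    return _apply


def run_with_limits(
    code: str,
    wall_timeout: float = 5.0,
    cpu_seconds: int = 2,
    memory_bytes: int = 1024 * 1024 * 1024,
) -> Dict[str, str]:
    """Run a Python snippet in a child process with CPU, memory, and time caps."""
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated interpreter mode
            capture_output=True,
            text=True,
            timeout=wall_timeout,  # wall-clock limit enforced by the parent
            preexec_fn=_limit_setter(cpu_seconds, memory_bytes),
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "stdout": "", "stderr": ""}

    if proc.returncode == 0:
        status = "ok"
    elif proc.returncode < 0:
        status = "killed"  # terminated by a signal, e.g. CPU limit exceeded
    else:
        status = "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}


if __name__ == "__main__":
    print(run_with_limits("print(2 + 2)"))
    print(run_with_limits("while True: pass", wall_timeout=1.0, cpu_seconds=5))
```

The wall-clock timeout catches processes that sleep or block, while the CPU limit catches busy loops; using both covers cases that either one alone would miss.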

Overcoming Performance and Scalability Challenges with AI Sandboxes

Security isolation historically meant performance penalties that made sandboxed execution impractical for production workloads. Modern platforms have narrowed this tradeoff considerably through infrastructure engineering, although the remaining overhead is still workload-dependent.

Achieving Fast Cold Starts for Sandbox Environments

Cold start latency determines whether sandboxed execution feels responsive or sluggish. When an AI agent needs to run generated code, waiting several seconds for environment initialization kills user experience. Leading platforms achieve fast cold starts through several techniques:

Memory snapshotting: Pre-warm common execution states and restore them instead of initializing from scratch.

Optimized filesystems: Custom filesystem implementations that prioritize fast reads for common dependencies.

Modal's core platform emphasizes memory snapshotting and a custom filesystem specifically optimized for these fast-startup patterns.

Scaling Sandboxed Execution to Massive Concurrency

Production AI applications generate thousands of code execution requests simultaneously. Each coding agent interaction, each batch job iteration, and each user request might spawn sandboxed execution. Scaling requirements include:

Rapid provisioning: Create new sandbox instances in milliseconds, not seconds.

Efficient scheduling: Route requests to available capacity without queuing delays.

Resource pooling: Share underlying infrastructure across sandboxes while maintaining isolation.

Automatic scaling: Expand and contract capacity based on demand without manual intervention.

Modal Sandboxes demonstrate these capabilities with support for 50,000+ concurrent sessions, scaling from zero to thousands of containers as demand fluctuates.
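
As a rough client-side illustration of scheduling against available capacity, the sketch below caps the number of in-flight sandbox executions with an `asyncio.Semaphore`. It is not how any particular platform schedules work: `SandboxDispatcher` and `_run_in_sandbox` are hypothetical, and the latter is a stub that only sleeps. Requests beyond the cap wait here, whereas an autoscaling platform would add capacity instead.

```python
import asyncio
from typing import List, Tuple


class SandboxDispatcher:
    """Bounds how many sandbox executions are in flight at once."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self.active = 0  # executions currently running
        self.peak = 0    # highest concurrency observed

    async def _run_in_sandbox(self, snippet: str) -> str:
        # Hypothetical stand-in for a real sandbox call.
        await asyncio.sleep(0.01)
        return f"ran:{snippet}"

    async def submit(self, snippet: str) -> str:
        async with self._slots:  # wait for a free slot
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await self._run_in_sandbox(snippet)
            finally:
                self.active -= 1


async def run_batch(snippets: List[str], max_concurrent: int) -> Tuple[List[str], int]:
    """Run all snippets and report the peak concurrency that was reached."""
    dispatcher = SandboxDispatcher(max_concurrent)
    results = await asyncio.gather(*(dispatcher.submit(s) for s in snippets))
    return list(results), dispatcher.peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_batch([str(i) for i in range(20)], 5))
    print(len(results), "results, peak concurrency", peak)
```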

Networking and Data Flow: Controlling Ingress and Egress in Sandboxes

Network access represents the highest-risk capability for untrusted code. An AI-generated script with unrestricted network access can exfiltrate data, download additional payloads, or attack external systems.

Network isolation should be the default configuration. Sandboxes start with no network access, and you explicitly enable only required connections.

Ingress controls determine how external requests reach sandboxed services. For AI sandboxes running web services, configure specific ports and protocols rather than exposing arbitrary network surfaces.

Tunneling capabilities enable controlled access when needed. Modal's sandbox networking documentation covers how to configure tunnels and port exposure for interactive processes while maintaining security boundaries.

Data flow logging tracks all network activity for audit purposes. Even when connections succeed, logging enables detection of unexpected communication patterns indicating compromise.
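
The default-deny principle can be expressed as a small policy check. The sketch below uses Python's standard `ipaddress` module. The function name `is_egress_allowed` and the CIDR ranges in the usage lines are illustrative assumptions, and a real sandbox enforces this at the network layer rather than in application code.

```python
import ipaddress
from typing import Iterable


def is_egress_allowed(dest_ip: str, allowed_cidrs: Iterable[str] = ()) -> bool:
    """Default-deny egress check: allow only destinations inside an allowlist.

    With an empty allowlist (the default), every destination is denied.
    """
    addr = ipaddress.ip_address(dest_ip)
    for cidr in allowed_cidrs:
        network = ipaddress.ip_network(cidr, strict=False)
        # Compare only like with like (IPv4 vs. IPv6).
        if addr.version == network.version and addr in network:
            return True
    return False


if __name__ == "__main__":
    # Example ranges are placeholders, not recommendations.
    print(is_egress_allowed("10.0.1.5"))
    print(is_egress_allowed("10.0.1.5", ["10.0.0.0/16"]))
```

Starting from an empty allowlist means a forgotten configuration step fails closed instead of leaving the sandbox open.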

Observability and Auditing: Monitoring Untrusted Code Execution

You can't secure what you can't see. Comprehensive observability transforms sandbox security from reactive incident response to proactive threat detection.

Real-time logging: Captures stdout, stderr, and system events from every sandbox execution. Stream logs to centralized platforms for correlation and alerting.

Metrics collection: Tracks resource utilization patterns. Sudden spikes in CPU, memory, or network usage might indicate cryptomining, unauthorized data processing, or attack activity.

Distributed tracing: Follows requests across sandbox boundaries. When AI-generated code calls external services or spawns additional processes, tracing maintains visibility.

Audit logs: Provide compliance evidence and forensic capability. Modal's Enterprise plan includes audit logs, and Modal can export audit logs to an OpenTelemetry provider alongside function logs and container metrics.

Alerting rules: Define thresholds for automatic notification. Configure alerts for failed executions, resource limit violations, and anomalous behavior patterns.
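
Threshold-based alerting of the kind described above can be sketched in a few lines. The metric names, thresholds, and the `AlertRule` type below are hypothetical. A production setup would normally define such rules in the monitoring system that receives the sandbox's logs and metrics rather than in application code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AlertRule:
    name: str         # human-readable alert name
    metric: str       # key to look up in the metrics snapshot
    threshold: float  # alert fires when the metric exceeds this value


def evaluate_rules(metrics: Dict[str, float], rules: List[AlertRule]) -> List[str]:
    """Return the names of all rules whose metric exceeds its threshold.

    Metrics missing from the snapshot are treated as 0.0 and never fire.
    """
    return [
        rule.name
        for rule in rules
        if metrics.get(rule.metric, 0.0) > rule.threshold
    ]


if __name__ == "__main__":
    # Illustrative rules; real thresholds depend on the workload.
    rules = [
        AlertRule("high-cpu", "cpu_percent", 90.0),
        AlertRule("egress-spike", "egress_bytes_per_s", 1_000_000.0),
        AlertRule("failed-executions", "failed_exec_count", 5.0),
    ]
    snapshot = {"cpu_percent": 97.5, "egress_bytes_per_s": 1200.0}
    print(evaluate_rules(snapshot, rules))
```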

The Shared Responsibility Model: Your Role in Sandbox Security

Even with managed sandbox platforms handling infrastructure security, you maintain responsibility for application-level security decisions.

Your responsibilities include defining appropriate resource limits for each workload type, configuring network access policies aligned with actual requirements, managing secrets and credentials passed to sandboxed code, reviewing and responding to security alerts, maintaining secure coding practices in wrapper code, and implementing input validation before passing data to sandboxes.

Platform responsibilities typically cover infrastructure security and patching, isolation technology maintenance, compliance assurances (SOC 2 Type II audits) and support for HIPAA-compliant usage, network security at the platform level, and physical data center security.

Modal articulates a shared responsibility model covering backup, recovery, and availability, with vulnerability remediation SLAs targeting 24 hours for critical issues and one week for high-severity findings.
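
On the customer side of this split, managing the secrets passed to sandboxed code is one of the easier responsibilities to get wrong. A minimal sketch, assuming an allowlist approach: build the sandbox's environment from an explicit set of permitted keys instead of forwarding the host environment. The function name `build_sandbox_env` and the variable names in the usage lines are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def build_sandbox_env(
    host_env: Mapping[str, str], allowed_keys: Iterable[str]
) -> Dict[str, str]:
    """Copy only explicitly allowlisted variables into the sandbox environment.

    Anything not named in allowed_keys (API keys, cloud credentials, tokens)
    is left out, so untrusted code cannot read it.
    """
    allowed = set(allowed_keys)
    return {key: value for key, value in host_env.items() if key in allowed}


if __name__ == "__main__":
    # Fake host environment for illustration only.
    host_env = {
        "LANG": "C.UTF-8",
        "TZ": "UTC",
        "DATABASE_PASSWORD": "not-for-the-sandbox",
    }
    print(build_sandbox_env(host_env, ["LANG", "TZ"]))
```

An allowlist fails closed: a credential added to the host environment later stays out of the sandbox unless someone deliberately permits it.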

Why Modal's Sandbox is Worth Checking Out for Secure AI Code Execution

Modal built its sandbox capabilities for running untrusted user or agent code, including AI-generated code, making it a strong fit for teams running inference, training, and batch processing workloads that need secure code isolation.

Technical foundation: Modal's Sandboxes are built on gVisor, which provides strong isolation and custom logic to block malicious system calls. gVisor's Sentry handles workload system calls itself rather than passing them through to the host, which reduces exposure to the kernel-level exploits that standard containers leave open.

Performance at scale: Modal Sandboxes deliver fast cold starts while supporting 50,000+ concurrent sessions. The platform also provides memory snapshotting and a custom filesystem optimized for fast startup as core capabilities.

Platform integration: Unlike standalone sandbox services, Modal integrates sandboxed execution with its broader AI infrastructure. You can run inference, training, sandboxes, batch jobs, and notebooks on the same platform, with native observability and telemetry integrations.

Compliance readiness: Modal has completed a SOC 2 Type II audit with no deviations found. Modal supports HIPAA-compliant workloads on Enterprise plans via a BAA for customers handling PHI.

Developer experience: Modal's code-first SDK lets you define sandbox environments in code rather than configuration files, and Modal Sandboxes support all programming languages.

For teams already using Modal for ML workloads, adding sandboxed execution for AI-generated code requires minimal additional infrastructure. For teams evaluating sandbox platforms specifically, Modal offers a compelling combination of security, performance, and platform breadth.

Check the sandboxes documentation to explore implementation patterns.


Frequently Asked Questions

What is the primary benefit of using an AI sandbox for untrusted code?

An AI sandbox provides isolated execution that limits untrusted code's ability to access host systems, sensitive data, or other workloads, even when that code behaves maliciously or unexpectedly. Without sandboxing, every AI code execution creates potential for system compromise, data exfiltration, or lateral movement to other infrastructure components. The NVIDIA AI Red Team has documented how unsandboxed AI code execution workflows enable remote code execution via prompt injection, reinforcing that sandboxing is a required security control.

How does gVisor enhance the security of containerized sandboxes?

gVisor implements a user-space "application kernel" that intercepts system calls before they reach the host kernel, reducing the kernel attack surface available to untrusted workloads. The gVisor Sentry handles all workload syscalls and restricts its own host interactions to roughly 68 allowlisted syscalls out of over 350 in the Linux kernel. Overhead is workload-dependent: CPU-bound tasks see minimal impact, while syscall- and I/O-heavy workloads can experience more significant slowdowns, as peer-reviewed benchmarking confirms.

Can Modal Sandboxes be used for both AI-generated code and general untrusted code?

Yes, Modal Sandboxes support any untrusted code execution use case, not just AI-generated code. The platform's gVisor isolation, network controls, and resource limits apply equally to code from any source, whether generated by LLMs, submitted by users, or pulled from external repositories.

What kind of performance can I expect from a well-implemented AI sandbox solution?

Modern AI sandbox platforms achieve fast cold starts, and Modal documents memory snapshotting, a custom filesystem optimized for fast startup, and gVisor-based isolation as core platform capabilities. Modal specifically supports scaling to 50,000+ concurrent sessions while maintaining isolation guarantees, enabling production workloads that would have been impractical with older sandbox technologies.

What compliance standards are relevant for running untrusted code in production environments?

SOC 2 Type II is a commonly requested assurance report in many enterprise procurement processes, covering controls related to security, availability, and confidentiality as defined by the AICPA trust services categories. Requirements vary by customer and industry. Modal has completed a SOC 2 Type II audit with no deviations found. For healthcare and regulated industries, when a covered entity engages a service provider as a business associate involving PHI, HIPAA requires a Business Associate Agreement (BAA) defining permitted uses, disclosures, and safeguards. Modal supports HIPAA-compliant workloads on Enterprise plans through BAAs for customers handling protected health information.