Modal's Series C: Raising $355M at a $4.65B valuation

We’ve raised $355 million after growing fivefold since September, surpassing $300 million in annualized revenue. The round values Modal at $4.65B post-money and was led by General Catalyst and Redpoint, with Menlo, Bain Capital Ventures, and Accel joining as new investors. All of our existing major investors participated as well, doubling down on their conviction in Modal.

A new infrastructure layer for the AI era

We started Modal because the cloud built for traditional web applications was never going to fit AI workloads. This was clear to us before the GenAI revolution, and it has only become more true as models and techniques have advanced.

Modal is a cloud built for AI. Not a single-purpose GPU cloud, but a platform with the right primitives for developers to build a very wide range of applications. Today, this looks like low-latency elastic inference, dynamic agent runtimes, reinforcement learning, batch jobs at massive scale, and much more.

From frontier APIs to model ownership

From digital natives like DoorDash to AI-native companies like Reducto, the teams pulling ahead are taking ownership of their models. They're fine-tuning with their own data, running RL, and tuning inference for their own latency, throughput, and cost needs. Open-weight models from DeepSeek, Qwen, and others have reached production quality, and inference engines like vLLM and SGLang have matured alongside them. For the first time, the full stack to own and serve your models is there, without sacrificing capability.

“Modal powers both our reinforcement learning infrastructure and production inference. Millions of sandboxes on one end, real-time serving on the other. All on the same platform.”

— Scott Wu, CEO, Cognition

“Decagon was able to achieve a p90 latency of 342ms, well below the sub-second range required for natural customer conversations — delivering speed, efficiency, and enterprise-scale reliability”

— Research Team, Decagon

Agents need better execution environments

In 2023, we started seeing users run AI-generated code on Modal. It was clear this would become a universal need, so we built Sandboxes, isolated environments for untrusted code, as a first-class primitive. It took two years for the explosion to happen.

In the last six months, it's become clear: agents are going to be everywhere, and they're far more powerful when they have a runtime to operate in. DoorDash is building AI agents for merchants, coding agents like Ramp's Inspect author 70% of merged PRs, RL workloads run thousands of environments in parallel, and autoresearch agents run their own training experiments at scale. Over 1 billion sandboxes have been launched on Modal.

“Sandboxes are one of the most important building blocks for Reinforcement Learning. Out of everyone, Modal was clearly very flexible, structured in a way where we could build complex environments, really focused on performance and reliability.”

— Yash Patil, CEO, Applied Compute

“As we scale agentic commerce for local businesses, we need a highly efficient path to production with full harness control, scale, and reliability. We’re excited to evaluate Claude Managed Agents for this next step, building on our AI infrastructure with Modal.”

— Andy Fang, CTO, DoorDash

The shape of AI keeps expanding

Modal is a general compute platform built for the underlying needs of AI workloads: elastic compute, safe isolation, and programmatic control. Developers compose these primitives into very different applications. Physical Intelligence runs real-time inference for live robots. Chai Discovery scales drug discovery pipelines from protein embeddings to antibody design. Suno generates millions of songs a day, scaling to thousands of GPUs and back to near-zero. Same primitives, completely different shapes.

“We use Modal to run edge inference with <10ms overhead and batch jobs at large scale. Our team loves the platform for the power and flexibility it gives us.”

— Brian Ichter, Co-founder, Physical Intelligence

“It’s not just a time savings, it’s the mental overhead that disappears. With Modal, we add a few decorators to a function we need to scale, forget about them, and they just work”

— Kevin Wu, ML Researcher, Chai Discovery

What we’re building next

We’ve spent the last five years going very deep on technology, including building our own storage and compute layer from the ground up. This has enabled us to achieve outcomes that once seemed impossible: improving cold starts by 100x with GPU snapshotting, serving elastic low-latency inference globally, and scaling from 0 to 1,000 GPUs in minutes (or even seconds) without reservations by pooling capacity across hundreds of data centers around the world.

Because we own the full stack, we can keep compounding those advantages to deliver a better and better experience for developers.

That foundation is what makes the next phase possible. Here's where we're going:

Low-latency inference at scale.

The bar for production inference keeps moving, and we're doubling down on what lets teams iterate fast: better serving primitives, sharper observability, and continued investment in the open inference stack. We've assembled a team of inference engineers contributing to FlashAttention, vLLM, SGLang, and more, because performance gains should flow back to the community building on the same engines.

Collapsing the training and inference loop.

Reinforcement learning is a hard infrastructure problem. Multi-node training, elastic inference, and sandboxes already have first-class support on Modal, which makes the full RL loop a natural fit. Our users are already seeing Pareto-efficient outcomes in quality, cost, latency, and throughput. We want to make the complete model training lifecycle, from first fine-tune to production serving, accessible to far more teams.

The compute layer for agents.

Sandboxes already drive more than a third of our revenue, and customers keep pushing us for more. We're expanding the Sandbox surface with new capabilities and the scale to run millions in parallel. At the same time, we recognize that agentic development is here. Modal is code, which already makes it a great place for agents to work. We're going to keep improving here, starting by shipping granular RBAC so customers can give agents capability without risk.

The AI infrastructure layer is just getting started. So are we.

If you’d like to join our team of 120+ across NY, SF, and Stockholm, check out our open roles here.