We've partnered with Mistral to bring you Day 0 support for Mistral 3, with GPU-snapshot-optimized performance.
Turns out, good devex for agents looks a lot like good devex for humans.
Learn how Reducto used GPU memory snapshotting and flexible autoscaling to build fast multi-model pipelines.
Never block the GPU.
How we built a real-time voice bot on Modal's distributed serverless platform.
Welcome to another round of Modal Product Updates! Here's what's new this month.
We've collaborated with Datalab, the creators of Marker and Surya, to make it faster than ever to deploy document intelligence workflows.
How Decagon and Modal made real-time voice AI possible, combining fine-tuned small models with a re-engineered inference runtime for sub-second latency.
Zencastr scaled up to 1,500 concurrent GPUs on Modal to process hundreds of years of podcast audio in just a few days. Today they run transcription, speaker detection, and audio enrichment for millions of podcast episodes on Modal, gaining cost efficiency, fast iteration, and zero DevOps overhead.