Fast, cheap transcription and more

Fine-tune and deploy popular open-source speech models like Whisper.

Suno logo

"Suno has developed proprietary state-of-the-art models that generate music and speech using AI. Modal's superb developer experience enables our team to ship new models to production quickly, and with confidence that we'll scale to thousands of simultaneous users."

Georg Kucsko, Co-Founder
Phonic logo

"At Phonic, we train our own proprietary models for audio generation. We moved all our large-scale audio processing batch jobs to Modal. Our engineers are ecstatic with the result – we can run at a much larger scale than before, no longer have to babysit our batch jobs, and we can ship much faster."

Moin Nadeem, Co-Founder
Substack logo

"When Substack launched a feature for AI-powered audio transcriptions, the data team picked Modal because it makes it easy to write code that runs on 100s of GPUs in parallel, transcribing podcasts in a fraction of the time."

Mike Cohen, Head of Data

Cheap, efficient transcription

Get Started

Outperform managed APIs

100x faster and cheaper than proprietary transcription APIs

Scale on demand

Deploy Whisper, Parakeet, or even your own custom speech-to-text model

Build your own AI voice

Get Started

Seamless integrations

Integrate with your favorite voice AI frameworks like Pipecat and LiveKit

Rapid response times

Achieve 100ms latency using WebRTC on Modal

Ship your first app in minutes.

$30 / month free compute
