Substack is a popular online platform for writers to publish newsletters, with over 17k writers and $300M in paid subscription volume.
Substack employs ML for a variety of tasks, including spam detection, newsletter recommendations, audio transcription, sentiment analysis, and image generation. For nearly all of these models, Substack has moved both training and deployment from AWS SageMaker to Modal.
The challenges of deploying AI and ML before Modal
Previously, Substack’s training and deployment pipelines were built on AWS SageMaker and orchestrated with Airflow. Adding or updating models was a slow and painful process for a few reasons.
One, the developer experience on SageMaker was convoluted. Engineers had to navigate to the SageMaker product, create a notebook, specify machine requirements, and wait for that machine to turn on, all before a single line of code could be written. On top of that, they had to juggle multiple remote environments: Jupyter notebooks, SageMaker training machines, and the final production infra.
Two, collaborating was difficult. Mike Cohen, Head of AI and MLE at Substack, found that his team often had to duplicate code across various notebooks due to the way SageMaker would package code before sending it off to training machines. It was hard to share components across similar projects.
Three, containers took forever to start up. Engineers had to wait 5+ minutes for training machines to spin up, impeding their ability to iterate quickly and incurring unnecessary costs. They tried AWS’s Serverless Inference product thinking it’d be faster, but it lacked GPU support and still proved to be too slow.
Making everything faster on Modal
Mike first tried Modal to run a transcription model and quickly realized how much more natural it felt to iterate on Modal. All development and testing happened in the most obvious places: his code editor and the command line. Getting an inference endpoint up and running took just an hour, and Modal’s container autoscaling let them parallelize transcription workloads out of the box.
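As a minimal sketch of this pattern, here is roughly what a parallel transcription job looks like on Modal. The app name, Whisper model choice, GPU type, and URLs are illustrative assumptions, not Substack’s actual code:

```python
import modal

app = modal.App("transcription-example")

# Container image with the model's dependencies (Whisper assumed here).
image = modal.Image.debian_slim().apt_install("ffmpeg").pip_install("openai-whisper")

@app.function(image=image, gpu="A10G")
def transcribe(audio_url: str) -> str:
    import urllib.request
    import whisper

    # Fetch the audio file and run the transcription model over it.
    local_path = "/tmp/audio.mp3"
    urllib.request.urlretrieve(audio_url, local_path)
    model = whisper.load_model("base")
    return model.transcribe(local_path)["text"]

@app.local_entrypoint()
def main():
    urls = ["https://example.com/ep1.mp3", "https://example.com/ep2.mp3"]
    # .map fans the calls out across containers; Modal autoscales the
    # container pool to match the size of the workload.
    for text in transcribe.map(urls):
        print(text[:80])
```

Everything here lives in a single file that runs from the terminal with `modal run`, which is what makes iterating from an editor and a command line feel so natural.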
Substack then decided to migrate training and deployment for their existing models as well. They implemented a full fine-tune → validation → deployment workflow by leveraging Modal’s native storage primitives (a rough sketch in code follows the list). In this workflow, they:
- Fetched training data from Snowflake, wrote it to S3, then mounted the S3 bucket to Modal.
- Ran the model fitting with Modal functions.
- Saved the weights to a Modal network volume.
- Ran model validation with another Modal function using those weights.
- Deployed the new model on Modal and updated a Modal key-value store to track which model version was in production.
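Here is a rough sketch of how a pipeline like this can be wired together with Modal’s storage primitives. The bucket, secret, volume, and function names below are illustrative assumptions, not Substack’s actual code:

```python
import modal

app = modal.App("train-validate-deploy-example")

# Persistent storage: a Volume for model weights and a Dict that tracks
# which model version is currently serving production traffic.
weights = modal.Volume.from_name("model-weights", create_if_missing=True)
registry = modal.Dict.from_name("model-registry", create_if_missing=True)

# Mount the S3 bucket that the Snowflake export is written to.
data = modal.CloudBucketMount("training-data-bucket", secret=modal.Secret.from_name("aws-creds"))

@app.function(gpu="A100", volumes={"/data": data, "/weights": weights})
def train(version: str):
    # Fit the model on the mounted S3 data (training loop elided) and
    # write the resulting weights to the shared Volume.
    with open(f"/weights/{version}.pt", "wb") as f:
        f.write(b"<serialized weights>")
    weights.commit()  # persist writes so downstream functions can read them

@app.function(volumes={"/weights": weights})
def validate(version: str) -> bool:
    import os

    # Load the candidate weights and score them on a holdout set
    # (evaluation elided); gate deployment on the result.
    weights.reload()  # pick up the latest committed state
    return os.path.exists(f"/weights/{version}.pt")

@app.local_entrypoint()
def main(version: str = "v2"):
    train.remote(version)
    if validate.remote(version):
        # Point production at the new version; a deployed inference
        # endpoint can read this key to decide which weights to load.
        registry["production_version"] = version
```

Because each stage is an ordinary Modal function sharing the same Volume and Dict, the whole workflow can be kicked off with a single `modal run`, and a deployed inference app can consult the same key-value store to pick up new model versions.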
By moving to Modal, Substack has been able to develop and deploy ML workflows with greater speed and flexibility than ever before.