Featured examples

Serving Diffusion models

Serve Stable Diffusion XL on Modal with a number of optimizations for blazingly fast inference.

Serverless TensorRT-LLM

Run large language models at SOTA performance by building a TRT-LLM engine on Modal.

Stable Diffusion fine-tuning with Dreambooth

Fine-tune a Stable Diffusion model on images of your pet using Dreambooth.

Voice chat with LLMs

Build a real-time voice chat app by combining speech-to-text, an LLM, and text-to-speech.

Fast podcast transcriptions

Build an end-to-end podcast transcription app that leverages dozens of containers for super-fast processing.

Document OCR job queue

Use Modal as an infinitely scalable job queue that can service async tasks from a web app.

Parallel processing of Parquet files on S3

Analyze data from the NYC Taxi and Limousine Commission in parallel.

Retrieval-Augmented Generation for Q&A

Build a question-answering web endpoint that can cite its sources.

Hacker News Slackbot

Use Modal to deploy a cron job that periodically queries Hacker News for new posts and sends the results to Slack.

Real-time object detection

Create a web endpoint that uses Hugging Face models to do object detection in real time.

ControlNet playgrounds

Play with all 10 demo Gradio apps from the ControlNet project.
