Learn about top text-to-image models on the Artificial Analysis leaderboard
Learn about the cost of Nvidia A100 GPUs and explore top GPU-on-demand platforms for accessing this powerful hardware.
Learn how to select an embedding model for your RAG system
Learn which models to use to segment objects in images and videos
Learn about the top open-source text-to-video AI models
Learn about the most popular text-to-image diffusion model on the market
Learn how to speed up your model training and inference with Flash Attention
Confused about what each hyperparameter means when you're doing LLM fine-tuning? Our glossary will help.
You want to tailor an LLM to your custom dataset. Should you fine-tune or build a RAG system?
Learn how to create interactive workflows that dynamically adapt to user inputs with Kestra’s open-source orchestration platform and Modal’s serverless infrastructure.
An overview of the top-ranking embedding models on the MTEB leaderboard
Learn how to speed up your model inference with vLLM or TGI
Learn all about top serverless GPU providers
How do AWS Lambda and Google Cloud Functions compare? This article provides a detailed comparison of these two popular serverless execution environments.
An in-depth look at the differences between Dagster and Airflow for data orchestration
A comprehensive guide to the pricing model for Google Cloud Run functions, including differences between 1st and 2nd gen, CPU and memory allocation, and key pricing metrics. Learn how to optimize your serverless costs.
Explore the relationship between Google Cloud Run and Cloud Run Functions, their key differences, and how to choose the right serverless option for your needs.
Learn about the key differences between RabbitMQ and Apache Kafka, their use cases, and how to choose the right messaging system for your needs.
Learn about gotchas and best practices for serverless inference
A roundup of popular open-source AI agents like OpenHands (formerly OpenDevin), SWE-agent, and Devika.
Serve Meta's Llama 3.1 foundation models via API
Learn how to launch a Jupyter notebook backed by Modal GPUs with this step-by-step guide.
Learn how to run ChatTTS text-to-speech with this step-by-step guide.
Deploying a Gradio app on Modal
Learn how to run Llama 3.1 405B on Modal with this step-by-step guide.
Learn how to run Ollama on Modal with this step-by-step guide.
Learn how to run XTTS text-to-speech with this step-by-step guide.
Deploying your first Lambda function using AWS SAM
Deep dive on Modal's optimizations for fast, lazy container loading
Learn how to create a serverless solution for uploading JPEG images to Amazon S3 using AWS API Gateway and Lambda with TypeScript
Understand the crucial differences between batch processing and stream processing by example
Estimating VRAM requirements for large language model fine-tuning
Estimating VRAM requirements for large language model inference
Discover the best GPU for your AI workload: Compare A10, A100, and H100 performance, pricing, and use cases to make an informed decision.
Which Stable Diffusion web UI should I use?
Learn the differences between LoRA and QLoRA, two efficient fine-tuning techniques for large language models.
Explore the top open-source text-to-speech libraries available in 2024, including TortoiseTTS, XTTS, StyleTTS, MeloTTS, OpenVoice v2, and VITS. Learn about their unique features and potential applications.
Optimizations for blazing-fast container launches
Learn about the cost of Nvidia H100 GPUs and explore top GPU-on-demand platforms for accessing this powerful hardware.
A comparison of WhisperX and Deepgram for speech-to-text transcription
A comparison of Axolotl, Unsloth, and Torchtune for LLM fine-tuning
An overview of the best open-source LLMs
A brief explanation of cron jobs, cron syntax, and how to run cron jobs on Modal.