Concurrency and rate limits

Modal has two mechanisms for limiting function execution.

Concurrency limit

You can limit the number of concurrent executions for a function:

def my_concurrency_limited_function():
    pass  # function body goes here

This caps the number of Modal containers that can run simultaneously for this function. Because each Modal container handles a single input at a time, the limit is also the maximum number of inputs processed in parallel.
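
As a rough local analogy (not Modal code, and not how Modal implements the limit), the behavior can be sketched with a thread pool in which each worker stands in for one container and handles one input at a time. The names LIMIT and handle are hypothetical and exist only for this illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

LIMIT = 3  # hypothetical concurrency limit; each worker plays the role of one container

lock = threading.Lock()
running = 0  # inputs being handled right now
peak = 0     # highest number of inputs ever handled at the same time


def handle(x: int) -> int:
    """Handle one input while tracking how many inputs are in flight."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.01)  # simulate work
    with lock:
        running -= 1
    return x * x


# At most LIMIT workers exist, so at most LIMIT inputs are processed at once;
# the remaining inputs wait in the queue until a worker is free.
with ThreadPoolExecutor(max_workers=LIMIT) as pool:
    results = list(pool.map(handle, range(10)))

print(results)
print(peak <= LIMIT)
```

All ten inputs are eventually processed, but the peak number in flight never exceeds the limit.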

Rate limit

You can also set a rate limit on a Modal function, which caps how many times the function will execute in a given time period. The rate limit counter resets at the beginning of each time period, so calls made just before and just after a period boundary count against different periods. For example, with a rate limit of 10 calls per minute, you can successfully call the function 10 times at 18:00:59 and then 10 more times at 18:01:00.
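
The reset behavior described above matches what is commonly called a fixed-window counter. The following is a minimal sketch of that idea in plain Python, assuming a one-minute window. It is an illustration only, not Modal's implementation; the FixedWindowLimiter class and the date used are hypothetical.

```python
from datetime import datetime
from typing import Optional


class FixedWindowLimiter:
    """Hypothetical helper: allows at most `limit` calls per calendar minute."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.window: Optional[datetime] = None  # the minute currently being counted
        self.count = 0

    def allow(self, now: datetime) -> bool:
        window = now.replace(second=0, microsecond=0)
        if window != self.window:
            # A new minute has started, so the counter resets.
            self.window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowLimiter(limit=10)
before = datetime(2024, 1, 1, 18, 0, 59)  # arbitrary date, last second of the minute
after = datetime(2024, 1, 1, 18, 1, 0)    # first second of the next minute

first_burst = [limiter.allow(before) for _ in range(11)]
second_burst = [limiter.allow(after) for _ in range(10)]

print(first_burst.count(True))   # 10: the 11th call in the same minute is rejected
print(second_burst.count(True))  # 10: the counter was reset at 18:01:00
```

In this sketch, 20 calls succeed within about one second of each other, which is the boundary effect the example in the text describes.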

Currently, per_second and per_minute are the two interval lengths supported:

def per_second_limit():
    pass  # function body goes here

def per_minute_limit():
    pass  # function body goes here