Modal has a few different tools that help increase the performance of your applications.
Parallel execution of inputs
If your code is running the same function repeatedly with different independent
inputs (e.g., a grid search), the easiest way to increase performance is to run
those function calls in parallel using Modal’s Function.map() method.
Here is an example if we had a function evaluate_model that takes a single argument:
    import modal

    stub = modal.Stub()


    @stub.function
    def evaluate_model(x):
        ...


    if __name__ == "__main__":
        with stub.run():
            inputs = list(range(100))
            for result in evaluate_model.map(inputs):  # runs many inputs in parallel
                ...
In this example,
evaluate_model will be called with each of the 100 inputs
(the numbers 0 - 99 in this case) roughly in parallel and the results are
returned as an iterable with the results ordered in the same way as the inputs.
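The ordering guarantee can be illustrated locally with the standard library. The following is a minimal sketch, not Modal code: it uses `ThreadPoolExecutor.map` from `concurrent.futures`, which likewise returns results in input order even when individual calls finish out of order. The body of `evaluate_model` here is a hypothetical stand-in.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def evaluate_model(x):
    # Hypothetical stand-in: later inputs sleep less, so calls tend to
    # complete in a different order than they were submitted.
    time.sleep((10 - x) * 0.01)
    return x * x


inputs = list(range(10))

# Local analogy only: worker threads instead of Modal containers.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(evaluate_model, inputs))

# results[i] corresponds to inputs[i], regardless of completion order.
print(results)
```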
Out of order results and flatmap
Besides Modal functions, you can also use .map() on a Modal generator
(defined with stub.generator instead of stub.function). One generator is
created per input, and each output is returned as soon as it is produced.
This means the outputs will not necessarily come in the same order as the
inputs. Since a generator can yield zero or more results, the number of
outputs will not necessarily match the number of inputs either, like a
flatmap.
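To make the flatmap behavior concrete, here is a minimal local sketch using only the standard library rather than Modal. The generator `divisors` and the helper `drain` are hypothetical names chosen for illustration. One simplification to note: each generator's outputs are collected as a batch per input, whereas the text above describes outputs being returned individually as they are produced.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def divisors(n):
    # Hypothetical generator: yields zero or more outputs per input.
    for d in range(2, n):
        if n % d == 0:
            yield d


def drain(n):
    # Run one generator to completion for a single input.
    return list(divisors(n))


inputs = [7, 12, 9]
outputs = []

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(drain, n) for n in inputs]
    # as_completed yields futures in completion order, not input order.
    for future in as_completed(futures):
        outputs.extend(future.result())

# 7 yields nothing, 12 yields four values, 9 yields one value.
print(len(inputs), len(outputs))
```

Three inputs produce five outputs here, and the order in which they arrive depends on which worker finishes first.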
If your function takes multiple variable arguments, you can either use
Function.map() with one input iterator per argument, or Function.starmap()
with a single input iterator containing sequences (like tuples) that can be
spread over the arguments. This works similarly to Python’s built-in map and
itertools.starmap.
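The two calling styles mirror Python’s built-in `map` and `itertools.starmap`. The sketch below shows the correspondence in plain Python, not with Modal calls; the two-argument `evaluate_model` and its parameter names are hypothetical.

```python
from itertools import starmap


def evaluate_model(depth, width):
    # Hypothetical two-argument function used only for illustration.
    return depth * width


depths = [2, 4, 8]
widths = [16, 32, 64]

# Style 1: one input iterator per argument.
via_map = list(map(evaluate_model, depths, widths))

# Style 2: a single iterator of tuples, each spread over the arguments.
pairs = list(zip(depths, widths))
via_starmap = list(starmap(evaluate_model, pairs))

print(via_map == via_starmap)
```

Both styles produce the same results; which one is more convenient depends on whether the arguments already exist as separate sequences or as a list of tuples.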
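Note that .map() is a method on the Modal function object itself, so you don’t
call the function explicitly. The following usage is incorrect, because it
calls evaluate_model first and then looks for .map() on its return value:

    results = evaluate_model(inputs).map()  # incorrect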
Modal’s map is also not the same as using Python’s built-in map(). While the
following will technically work, it will execute all inputs in sequence rather
than in parallel:

    results = map(evaluate_model, inputs)
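The sequential behavior of the built-in `map()` can be checked locally. This is a minimal sketch in plain Python with a hypothetical `evaluate_model`; it counts how many calls are in flight at once and records the peak.

```python
import threading

active = 0  # calls currently executing
peak = 0    # highest number of simultaneous calls observed
lock = threading.Lock()


def evaluate_model(x):
    # Hypothetical stand-in that tracks how many calls overlap.
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    result = x * x
    with lock:
        active -= 1
    return result


inputs = list(range(5))

# The built-in map calls the function once per item, one after another,
# as the result is iterated.
results = list(map(evaluate_model, inputs))

print(peak)
```

The peak never exceeds one, because each call finishes before the next begins.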
All Modal APIs are available in both blocking and asynchronous variants. If you are comfortable with asynchronous programming, you can use it to create arbitrary parallel execution patterns, with the added benefit that any Modal functions will be executed remotely. See the async guide or the examples for more information about asynchronous usage.
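As an illustration of what an arbitrary parallel execution pattern can look like, here is a minimal fan-out and fan-in sketch using only `asyncio` from the standard library. It does not use Modal’s asynchronous API, whose call syntax is not shown in this section; `evaluate_model` is a hypothetical coroutine standing in for an awaitable remote call.

```python
import asyncio


async def evaluate_model(x):
    # Hypothetical stand-in for an awaitable remote call.
    await asyncio.sleep(0.01)
    return x * x


async def main():
    # Fan out: start all first-stage calls concurrently and wait for all.
    squares = await asyncio.gather(*(evaluate_model(x) for x in range(5)))
    # Fan in: a second stage that depends on every first-stage result.
    return await evaluate_model(sum(squares))


total = asyncio.run(main())
print(total)
```

`asyncio.gather` returns results in the order the awaitables were passed, so the first stage behaves like an ordered parallel map feeding a dependent second stage.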
Sometimes you can speed up your applications by using GPU acceleration. See the GPU section for more information.
If you want to limit concurrency, you can use the concurrency_limit argument
to stub.function. For instance:

    stub = modal.Stub()


    @stub.function(concurrency_limit=5)
    def f(x):
        print(x)
With this, Modal will run at most 5 concurrent functions at any point.
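The effect of such a cap can be sketched locally. The example below is an analogy using a thread pool with five workers, not Modal’s mechanism; the counters `active` and `peak` and the constant `LIMIT` are hypothetical names added to observe how many calls overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

LIMIT = 5   # maximum number of calls allowed to run at once
active = 0  # calls currently executing
peak = 0    # highest number of simultaneous calls observed
lock = threading.Lock()


def f(x):
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)  # simulate work so that calls overlap
    with lock:
        active -= 1
    return x


# Twenty inputs are queued, but at most LIMIT run at any point.
with ThreadPoolExecutor(max_workers=LIMIT) as pool:
    results = list(pool.map(f, range(20)))

print(peak <= LIMIT)
```

All twenty inputs are still processed; the cap only bounds how many are in progress at the same time, trading throughput for a lower load on whatever resource the function uses.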