Modal has a few different tools that helps with increasing performance of your applications.
Parallel execution of inputs
If your code is running the same function repeatedly with different independent
inputs (e.g., a grid search), the easiest way to increase performance is to run
those function calls in parallel using Modal’s
Here is an example if we had a function evaluate_model that takes a single argument:
import modal stub = modal.Stub() def evaluate_model(x): ... def main(): inputs = list(range(100)) for result in evaluate_model.map(inputs): # runs many inputs in parallel ...
In this example,
evaluate_model will be called with each of the 100 inputs
(the numbers 0 - 99 in this case) roughly in parallel and the results are
returned as an iterable with the results ordered in the same way as the inputs.
Out of order results and flatmap
.map() can also be used on Modal functions that wrap
generator will be created per input in the map, and each output from the
generator will then be returned as they are created. This means the outputs will
not necessarily come in the same order as the inputs. Since a generator can
yield zero or more results, the number of outputs will not necessarily match the
number of inputs either, like a “flat map”.
By default, if any of the function calls raises an exception, the exception will
be propagated. To treat exceptions as successful results and aggregate them in
the results list, pass in
def my_func(a): if a == 2: raise Exception("ohno") return a ** 2 def main(): print(list(my_func.map(range(3), return_exceptions=True))) # [0, 1, UserCodeException(Exception('ohno'))]
If your function takes multiple variable arguments, you can either use
Function.map() with one input iterator
per argument, or
with a single input iterator containing sequences (like tuples) that can be
spread over the arguments. This works similarly to Python’s built in
def my_func(a, b): return a + b def main(): assert list(my_func.starmap([(1, 2), (3, 4)])) == [3, 7]
.map() is a method on the modal function object itself, so you don’t
explicitly call the function.
results = evaluate_model(inputs).map()
Modal’s map is also not the same as using Python’s builtin
map(). While the
following will technically work, it will execute all inputs in sequence rather
than in parallel.
results = map(evaluate_model, inputs)
All Modal APIs are available in both blocking and asynchronous variants. If you are comfortable with asynchronous programming, you can use it to create arbitrary parallel execution patterns, with the added benefit that any Modal functions will be executed remotely. See the async guide or the examples for more information about asynchronous usage.
Sometimes you can speed up your applications by utilizing GPU acceleration. See the gpu section for more information.
If you want to limit concurrency, you can use the
stub.function. For instance:
stub = modal.Stub() def f(x): print(x)
With this, Modal will run at most 5 concurrent functions at any point.