Scaling out
Modal has a few different tools that help you increase the performance of your applications.
Parallel execution of inputs
If your code runs the same function repeatedly with different, independent
inputs (e.g., a grid search), the easiest way to increase performance is to run
those function calls in parallel using Modal’s Function.map() method.
Here is an example, assuming a function evaluate_model that takes a single argument:
import modal

app = modal.App()

@app.function()
def evaluate_model(x):
    ...

@app.local_entrypoint()
def main():
    inputs = list(range(100))
    for result in evaluate_model.map(inputs):  # runs many inputs in parallel
        ...
In this example, evaluate_model will be called with each of the 100 inputs
(the numbers 0 through 99) roughly in parallel. The results are returned as an
iterable, ordered in the same way as the inputs.
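The ordering guarantee can be illustrated locally with the standard library. This is only an analogy, not Modal's implementation: concurrent.futures runs the calls in local threads, and this evaluate_model is a hypothetical stand-in that makes later inputs finish sooner, so completion order differs from input order.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def evaluate_model(x):
    # Hypothetical stand-in: smaller inputs sleep longer, so they finish last.
    time.sleep(max(0, 5 - x) * 0.01)
    return x * x


def run_local_map(inputs):
    # Executor.map runs the calls concurrently but yields results in input order.
    with ThreadPoolExecutor(max_workers=5) as pool:
        return list(pool.map(evaluate_model, inputs))


results = run_local_map(range(5))
print(results)
```

Even though the call for input 4 completes first, its result still appears last, which is the same contract described above for the results of Function.map().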
Exceptions
By default, if any of the function calls raises an exception, the exception will
be propagated. To treat exceptions as successful results and aggregate them in
the results list, pass in return_exceptions=True.
@app.function()
def my_func(a):
    if a == 2:
        raise Exception("ohno")
    return a ** 2

@app.local_entrypoint()
def main():
    print(list(my_func.map(range(3), return_exceptions=True)))
    # [0, 1, UserCodeException(Exception('ohno'))]
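Once exceptions are mixed into the results list, you will usually want to separate them from the successful values. The following is a minimal sketch of one way to do that. split_results is a hypothetical helper, not part of Modal. The results list is hand-built for illustration, and the sketch assumes that the returned exception objects are instances of Exception.

```python
def split_results(results):
    """Partition results into (successes, failures), keeping each input index."""
    successes, failures = [], []
    for index, result in enumerate(results):
        if isinstance(result, Exception):
            failures.append((index, result))
        else:
            successes.append((index, result))
    return successes, failures


# Hand-built stand-in for list(my_func.map(range(3), return_exceptions=True)).
results = [0, 1, Exception("ohno")]
successes, failures = split_results(results)
print(successes)
print([(index, str(error)) for index, error in failures])
```

Because results come back in input order, the index stored with each failure identifies which input to retry or log.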
Starmap
If your function takes multiple arguments, you can either use Function.map()
with one input iterator per argument, or Function.starmap() with a single input
iterator containing sequences (like tuples) that can be spread over the
arguments. This works similarly to Python’s built-in map and itertools.starmap.
@app.function()
def my_func(a, b):
    return a + b

@app.local_entrypoint()
def main():
    assert list(my_func.starmap([(1, 2), (3, 4)])) == [3, 7]
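For comparison, here are the two calling conventions in plain Python, using only the built-in map and itertools.starmap. The add function is a local stand-in for illustration. Nothing here runs on Modal.

```python
from itertools import starmap


def add(a, b):
    return a + b


# One iterator per argument, as with the built-in map.
per_argument = list(map(add, [1, 3], [2, 4]))

# One iterator of argument tuples, as with itertools.starmap.
from_tuples = list(starmap(add, [(1, 2), (3, 4)]))

# zip converts the per-argument form into the tuple form.
zipped = list(zip([1, 3], [2, 4]))

print(per_argument, from_tuples, zipped)
```

Both forms compute the same results, so the choice depends on whether your inputs are already grouped per call (tuples) or stored per argument (parallel lists).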
Gotchas
Note that .map() is a method on the Modal function object itself, so you don’t
explicitly call the function.
Incorrect usage:
results = evaluate_model(inputs).map()
Modal’s map is also not the same as Python’s built-in map(). While the
following will technically work, it will execute all inputs in sequence rather
than in parallel.
Incorrect usage:
results = map(evaluate_model, inputs)
Asynchronous usage
All Modal APIs are available in both blocking and asynchronous variants. If you are comfortable with asynchronous programming, you can use it to create arbitrary parallel execution patterns, with the added benefit that any Modal functions will be executed remotely. See the async guide or the examples for more information about asynchronous usage.
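As a rough illustration of such a pattern, the sketch below uses plain asyncio to run a two-stage pipeline in which the stages for a single input run in sequence while different inputs run concurrently. preprocess and score are hypothetical stand-in coroutines. In a Modal app, they would be replaced by the asynchronous variants of Modal function calls described in the async guide.

```python
import asyncio


async def preprocess(x):
    # Hypothetical stand-in for an awaited remote call.
    await asyncio.sleep(0.01)
    return x + 1


async def score(x):
    # Hypothetical second stage that depends on the first stage's output.
    await asyncio.sleep(0.01)
    return x * 10


async def pipeline(x):
    # The two stages run in sequence for a single input.
    return await score(await preprocess(x))


async def run_all(inputs):
    # Pipelines for different inputs run concurrently; gather keeps input order.
    return await asyncio.gather(*(pipeline(x) for x in inputs))


results = asyncio.run(run_all(range(4)))
print(results)
```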
GPU acceleration
Sometimes you can speed up your applications with GPU acceleration. See the GPU section for more information.
Limiting concurrency
If you want to limit concurrency, you can use the concurrency_limit argument
to app.function. For instance:
app = modal.App()

@app.function(concurrency_limit=5)
def f(x):
    print(x)
With this, Modal will spin up at most 5 containers at any point.
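To see what a cap of 5 means in practice, here is a local analogy that uses an asyncio.Semaphore. It is only a sketch of the idea of bounded concurrency: Modal applies its limit to containers, not to coroutines, and run_with_limit is a hypothetical helper that does not appear in Modal's API.

```python
import asyncio


async def run_with_limit(inputs, limit):
    semaphore = asyncio.Semaphore(limit)
    running = 0  # workers currently inside the limited section
    peak = 0     # highest value of `running` observed

    async def worker(x):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # simulated work
            running -= 1
            return x

    results = await asyncio.gather(*(worker(x) for x in inputs))
    return results, peak


results, peak = asyncio.run(run_with_limit(range(20), limit=5))
print(peak)
```

All 20 inputs are still processed, but never more than 5 at once. The remaining inputs wait until a slot frees up.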