Modal Code Playground

Modal makes it simple and fast to run embarrassingly parallel jobs across hundreds of containers. In this tutorial, we’ll show you how to scale out a basic grid search in one dimension using Modal’s Function.map().

Let’s say we have a Modal Function fit_knn that uses one of sklearn’s built-in datasets and trains a k-nearest neighbors classifier on it. Currently, we run fit_knn with each value of k (number of nearest neighbors) sequentially.

We want to make this more efficient by calling fit_knn with each value of k in parallel.

Your turn: Parallelize this function

.map is a Function method that runs function calls in parallel, and returns the results as an iterable with the results ordered in the same way as the inputs.

Use .map to run a distributed hyperparameter search with fit_knnfor different values of k:

@app.local_entrypoint()
def main():
    results = [] 
    for k in range(1, 100):  
        results.append(fit_knn.remote(k))  
    results = fit_knn.map(range(1, 100))  
    best_score, best_k = max(results)
    print("Best k = %3d, score = %.4f" % (best_k, best_score))

Once your program started running, you should see that 99 containers running fit_knn are launched roughly in parallel. And all of this happens lightning fast!

Hyperparameter tuning is just one example of a workload you can parallelize effortlessly with Modal. Imagine how easy it is to scale any application from web scraping to batch processing, and even combining with GPU acceleration to run efficient distributed training experiments and evaluation sweeps!


$ modal run scaling_out.py