Run a job queue that turns documents into structured data with Datalab Marker

This tutorial shows you how to use Modal as an infinitely scalable job queue that can service async tasks from a web app.

Our job queue will handle a single task: converting images/PDFs into structured data. We’ll use Marker from Datalab, which can convert images of documents or PDFs to Markdown, JSON, and HTML. Marker is an open-weights model; to learn more about commercial usage, see here.

For the purposes of this tutorial, we’ve also built a React + FastAPI web app on Modal that works together with this job queue. But note that you don’t need a web app running on Modal to use this pattern: you can submit async tasks to Modal from any Python application (for example, a regular Django app running on Kubernetes).

Try it out for yourself here.

Define an App 

Let’s first import modal and define an App. Later, we’ll use the App’s name to look up our job queue from the web app and submit tasks to it.
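A minimal sketch (the App name "marker-job-queue" is illustrative, but it’s the handle other processes will use to find this App later):

```python
import modal

# The App's name is how other processes will look this job queue up later.
app = modal.App("marker-job-queue")
```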

We also define the dependencies we need by specifying an Image.
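For example, an Image that installs marker-pdf, the PyPI package for Datalab’s Marker (the Python version is illustrative; pin package versions as needed):

```python
image = modal.Image.debian_slim(python_version="3.11").pip_install("marker-pdf")
```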

Cache the pre-trained model on a Modal Volume 

We can obtain the pre-trained model we want to run from Datalab by using the Marker library.
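In ordinary Python, loading the models looks something like this sketch:

```python
from marker.models import create_model_dict

# Downloads the model weights on first call if they aren't cached locally.
models = create_model_dict()
```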

The create_model_dict function downloads model weights from Datalab’s cloud storage (S3 bucket) if they aren’t already present in the filesystem. However, in Modal’s serverless environment, filesystems are ephemeral, so using this code alone would mean that models need to be downloaded many times (every time a new instance of our Function spins up).

So instead, we create a Modal Volume to store the models. Each Modal Volume is a durable filesystem that any Modal Function can access. You can read more about storing model weights on Modal in our guide.
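A sketch of the cache setup. The Volume name is illustrative, and the mount path is an assumption about where Marker writes downloaded weights; check your marker version’s cache location before choosing it:

```python
# Where Marker caches downloaded weights inside the container (an assumption;
# adjust to match your marker version's cache location).
MODEL_CACHE_DIR = "/root/.cache"

model_cache = modal.Volume.from_name("marker-model-cache", create_if_missing=True)
```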

Run Datalab Marker on Modal 

Now let’s set up the actual inference.

Using the @app.function decorator, we set up a Modal Function. We provide arguments to that decorator to customize the hardware, scaling, and other features of the Function.

Here, we say that this Function should use NVIDIA L40S GPUs, automatically retry failures up to 3 times, and have access to our shared model cache.

Inside the Function, we write out our inference logic, which mostly involves configuring components provided by the marker library.
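Putting it together, here’s a sketch of the inference Function, reusing the app, image, and model_cache defined above. The function name, the bytes-in/Markdown-out interface, and the temp-file handling are illustrative; PdfConverter, create_model_dict, and text_from_rendered come from the marker library:

```python
import tempfile


@app.function(
    gpu="L40S",                              # run on an NVIDIA L40S GPU
    retries=3,                               # retry failures up to 3 times
    volumes={MODEL_CACHE_DIR: model_cache},  # attach the shared model cache
    image=image,
)
def parse_document(document_bytes: bytes) -> str:
    # Import inside the Function: marker is only installed in the Image.
    from marker.converters.pdf import PdfConverter
    from marker.models import create_model_dict
    from marker.output import text_from_rendered

    # Marker's converter takes a file path, so write the bytes to disk first.
    with tempfile.NamedTemporaryFile(suffix=".pdf") as f:
        f.write(document_bytes)
        f.flush()
        converter = PdfConverter(artifact_dict=create_model_dict())
        rendered = converter(f.name)

    markdown, _, _ = text_from_rendered(rendered)
    return markdown
```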

Testing and debugging remote code 

To make sure this code works, we want a way to kick the tires and debug it.

We can run it on Modal, with no need to set up separate local testing infrastructure, by adding a local_entrypoint that invokes the Function with .remote.
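A sketch of such an entrypoint (the argument name is illustrative):

```python
from pathlib import Path


@app.local_entrypoint()
def main(document_path: str):
    # .remote() runs parse_document on Modal and blocks for the result.
    markdown = parse_document.remote(Path(document_path).read_bytes())
    print(markdown)
```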

You can then run this from the command line with:
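Assuming the code above lives in a file called marker_job_queue.py (Modal turns the entrypoint’s parameters into CLI flags):

```shell
modal run marker_job_queue.py --document-path path/to/document.pdf
```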

Deploying the document conversion service 

Now that we have a Function, we can publish it by deploying the App:
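```shell
modal deploy marker_job_queue.py
```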

Once it’s published, we can look up this Function from another Python process and submit tasks to it:
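For example, from any Python process with the modal client installed (the App, Function, and file names match the sketches above):

```python
import modal

parse_document = modal.Function.from_name("marker-job-queue", "parse_document")

# .spawn() enqueues the call without waiting for the result...
call = parse_document.spawn(open("receipt.pdf", "rb").read())

# ...and the returned handle can be used to fetch the result later.
markdown = call.get()
```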

Modal will auto-scale to handle all the tasks queued, and then scale back down to 0 when there’s no work left. To see how you could use this from a Python web app, take a look at the receipt parser frontend tutorial.