Screenshot with Chromium

In this example, we use Modal functions and the playwright package to take a screenshot of a website with headless Chromium running in a remote container.

You can run this example on the command line with

modal run 02_building_containers/screenshot.py --url 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'

This should take a few seconds, then create a /tmp/screenshots/screenshot.png file, shown below.

[Image: screenshot]

Setup

First, we import pathlib and the Modal client library, then create the stub that our function and entrypoint attach to.

import pathlib

import modal

stub = modal.Stub("example-screenshot")

Define a custom image

We need an image with the playwright Python package as well as its Chromium plugin pre-installed. This requires installing a few Debian packages, as well as setting up a new Debian repository. Modal lets you run arbitrary commands, just like in Docker:

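image = modal.Image.debian_slim().run_commands(
    # Refresh the package index first so the install below can find packages.
    "apt-get update",
    "apt-get install -y software-properties-common",
    "apt-add-repository non-free",
    "apt-add-repository contrib",
    # Refresh again to pick up the newly added repository components.
    "apt-get update",
    "pip install playwright==1.20.0",
    "playwright install-deps chromium",
    "playwright install chromium",
)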

The screenshot function

Next, we define the screenshot function, which runs headless Chromium, goes to a website, and takes a screenshot. This is a Modal function, so it runs inside the remote container and returns the image bytes to the caller.

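@stub.function(image=image)
async def screenshot(url):
    # Imported here because playwright is only installed in the remote image.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        # Wait for the network to go idle before capturing the page.
        await page.goto(url, wait_until="networkidle")
        await page.screenshot(path="screenshot.png")
        await browser.close()

    # Read the image back so the bytes can be returned to the caller.
    with open("screenshot.png", "rb") as f:
        data = f.read()
    print("Screenshot of size %d bytes" % len(data))
    return data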
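
Because the function returns raw image bytes, the caller can sanity-check them before writing anything to disk. The following is a minimal sketch and not part of the original example: png_dimensions is a hypothetical helper that checks the 8-byte PNG signature and reads the width and height from the IHDR chunk, using only the standard library.

```python
from __future__ import annotations

import struct

# The fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) of a PNG image, or raise ValueError.

    Hypothetical helper for illustration; not part of the Modal example.
    """
    # Signature (8) + chunk length (4) + chunk type (4) + width (4) + height (4).
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG image")
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # Width and height are stored as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

Calling png_dimensions(data) on the returned bytes would fail fast if the remote function produced something other than a PNG.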

Entrypoint code

Let’s kick it off from a local entrypoint that takes a URL from the command line, calls the remote screenshot function, and writes the returned bytes to a local file.

@stub.local_entrypoint
def main(url: str = "https://modal.com"):
    filename = pathlib.Path("/tmp/screenshots/screenshot.png")
    data = screenshot.call(url)
    filename.parent.mkdir(exist_ok=True)
    with open(filename, "wb") as f:
        f.write(data)
    print(f"wrote {len(data)} bytes to {filename}")
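
The entrypoint always writes to the same path, so each run overwrites the previous screenshot. If you would rather keep one file per URL, a sketch along these lines could derive the filename from the URL. Here screenshot_path is a hypothetical helper, not part of the original example, and the naming scheme is an assumption.

```python
import pathlib
import re
from urllib.parse import urlparse


def screenshot_path(url: str, directory: str = "/tmp/screenshots") -> pathlib.Path:
    """Map a URL to a PNG path inside `directory` (hypothetical helper)."""
    parsed = urlparse(url)
    # Use host and path only; the query string is dropped, so URLs that
    # differ only in their query map to the same file.
    raw = parsed.netloc + parsed.path
    # Collapse every run of non-alphanumeric characters into a single dash.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-")
    # Fall back to a fixed name if the URL yields an empty slug.
    return pathlib.Path(directory) / f"{slug or 'screenshot'}.png"
```

In main, the fixed filename could then be replaced with screenshot_path(url), leaving the rest of the entrypoint unchanged.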

And we’re done! Please also see our introductory guide for another example of a web scraper, with more in-depth logic.

Try this on Modal!

You can run this on Modal with 60 seconds of work! Creating an account is free and no credit card is required.

After creating an account, install the Modal Python package and create an API token:

pip install modal-client
modal token new

Then clone the examples repository and run this example:

git clone https://github.com/modal-labs/modal-examples
cd modal-examples
modal run 02_building_containers/screenshot.py