Cloud bucket mounts

modal.CloudBucketMount is a mutable volume that allows both reading and writing files in a cloud bucket. It supports AWS S3, Cloudflare R2, and Google Cloud Storage buckets.

Cloud bucket mounts are built on top of AWS's Mountpoint technology and inherit its limitations.

Mounting Cloudflare R2 buckets

CloudBucketMount enables Cloudflare R2 buckets to be mounted as file system volumes. Because Cloudflare R2 is S3-compatible, the setup is very similar to that of S3. See modal.CloudBucketMount for usage instructions.

When creating the R2 API token for use with the mount, give it the ability to read, write, and list objects in the specific buckets you will mount. Admin permissions are not needed, and you should not use “Client IP Address Filtering”.
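
For illustration, here is a minimal sketch of mounting an R2 bucket. The bucket name, account ID, and credentials are placeholders; the R2 token's access key ID and secret are supplied as S3-style credentials, and the account-specific R2 endpoint is passed via bucket_endpoint_url.

import modal
import subprocess

app = modal.App()

r2_bucket_name = "r2-bucket-name"  # Placeholder bucket name.
r2_access_credentials = modal.Secret.from_dict({
    "AWS_ACCESS_KEY_ID": "...",      # R2 API token access key ID
    "AWS_SECRET_ACCESS_KEY": "...",  # R2 API token secret
    "AWS_REGION": "auto",            # R2's S3-compatible API uses the "auto" region
})

@app.function(
    volumes={
        "/my-mount": modal.CloudBucketMount(
            r2_bucket_name,
            bucket_endpoint_url="https://<ACCOUNT ID>.r2.cloudflarestorage.com",  # replace <ACCOUNT ID> with your Cloudflare account ID
            secret=r2_access_credentials,
        )
    }
)
def f():
    subprocess.run(["ls", "/my-mount"])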

Mounting S3 buckets

CloudBucketMount enables S3 buckets to be mounted as file system volumes. To interact with a bucket, you must have the appropriate IAM permissions configured (refer to the IAM permissions section below).

import modal
import subprocess

app = modal.App()  # Note: prior to April 2024, "app" was called "stub"

s3_bucket_name = "s3-bucket-name"  # Bucket name not ARN.
s3_access_credentials = modal.Secret.from_dict({
    "AWS_ACCESS_KEY_ID": "...",
    "AWS_SECRET_ACCESS_KEY": "...",
    "AWS_REGION": "..."
})

@app.function(
    volumes={
        "/my-mount": modal.CloudBucketMount(s3_bucket_name, secret=s3_access_credentials)
    }
)
def f():
    subprocess.run(["ls", "/my-mount"])

Specifying S3 bucket region

Amazon S3 buckets are associated with a single AWS Region. Mountpoint attempts to automatically detect the region of your S3 bucket at startup and directs all S3 requests to that region. However, in certain scenarios, such as when your container runs on an AWS worker in one region while your bucket is in a different region, this automatic detection may fail.

To avoid this issue, you can specify the region of your S3 bucket by adding an AWS_REGION key to your Modal secrets, as in the code example above.

Read-only mode

To mount a bucket in read-only mode, pass read_only=True as an argument.

import modal
import subprocess

app = modal.App()  # Note: prior to April 2024, "app" was called "stub"

s3_bucket_name = "s3-bucket-name"  # Bucket name not ARN.
s3_access_credentials = modal.Secret.from_dict({
    "AWS_ACCESS_KEY_ID": "...",
    "AWS_SECRET_ACCESS_KEY": "...",
})

@app.function(
    volumes={
        "/my-mount": modal.CloudBucketMount(s3_bucket_name, secret=s3_access_credentials, read_only=True)
    }
)
def f():
    subprocess.run(["ls", "/my-mount"])

While S3 mounts support both read and write operations, they are optimized for reading large files sequentially. Certain file operations, such as renaming files, are not supported (see the sketch below). For a comprehensive list of supported operations, consult the Mountpoint documentation.
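
For example, renaming a file inside a mount will fail rather than rename the object. A minimal sketch of guarding against this, assuming the /my-mount path from the examples above and that the failure surfaces as an OSError:

import os

def rename_in_mount():
    try:
        # Renames are not supported by the underlying Mountpoint file system,
        # so this is expected to raise instead of renaming the object.
        os.rename("/my-mount/old-name.txt", "/my-mount/new-name.txt")
    except OSError as exc:
        print(f"rename not supported: {exc}")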

IAM permissions

To use CloudBucketMount for reading and writing files in S3 buckets, your IAM policy must include permissions for s3:PutObject, s3:AbortMultipartUpload, and s3:DeleteObject. These permissions are not required for mounts configured with read_only=True.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ModalBucketAccess",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::<MY-S3-BUCKET>"]
    },
    {
      "Sid": "ModalBucketAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:DeleteObject"
      ],
      "Resource": ["arn:aws:s3:::<MY-S3-BUCKET>/*"]
    }
  ]
}

Known issues

Overwriting existing files currently fails with PermissionError: [Errno 1] Operation not permitted. A workaround is to check whether the file exists and delete it before writing:

import pathlib

def overwrite_file(filepath: pathlib.Path, data: bytes):
    if filepath.exists():
        filepath.unlink()
    filepath.write_bytes(data)
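
For instance, the helper can be called from a function that writes to the mount. This sketch reuses app, s3_bucket_name, and s3_access_credentials from the examples above; the file path and payload are hypothetical.

@app.function(
    volumes={
        "/my-mount": modal.CloudBucketMount(s3_bucket_name, secret=s3_access_credentials)
    }
)
def write_result():
    # Delete-then-write via the helper to avoid the overwrite error above.
    overwrite_file(pathlib.Path("/my-mount/results.json"), b'{"status": "ok"}')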