Shared volumes
Modal lets you create writeable volumes that can be simultaneously attached to multiple Modal functions. These are helpful for use cases such as:
- Caching model checkpoints
- Storing datasets
- Keeping a shared cache for expensive computations
Basic example
The modal.SharedVolume constructor initializes an empty volume. This can be mounted within a function by providing a mapping between mount paths and SharedVolume objects. For example, to use a SharedVolume to initialize a shared shelve disk cache:
import shelve
import modal

# The stub must exist before @stub.function can be used.
stub = modal.Stub()
volume = modal.SharedVolume()

@stub.function(shared_volumes={"/root/cache": volume})
def expensive_computation(key: str):
    # The shelve file lives on the shared volume, so every container sees it.
    with shelve.open("/root/cache/shelve") as cache:
        cached_val = cache.get(key)
        if cached_val is not None:
            return cached_val
        # cache miss; populate value
        ...
The above implements basic disk caching, but be aware that shelve does not guarantee correctness in the event of concurrent read/write operations. To protect against concurrent write conflicts, the flufl.lock package is useful. An example of that library’s usage is in the Datasette example.
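The cache-miss branch elided in the example, together with the locking advice, might be sketched as follows. This is a minimal illustration that runs locally with only the standard library. It is not Modal's API: the helper name get_or_compute and its lock parameter are hypothetical, and a temporary directory stands in for the mounted /root/cache path. The lock argument is assumed to be any context manager that acquires on entry and releases on exit, such as a file lock kept on the shared volume. The default is a no-op, which is only safe with a single writer.

```python
import contextlib
import shelve
import tempfile
from pathlib import Path


def get_or_compute(cache_path, key, compute, lock=None):
    """Return the cached value for key, computing and storing it on a miss.

    `lock` is a hypothetical hook: any context manager that serializes
    access to the shelve file. With lock=None nothing is serialized.
    """
    guard = lock if lock is not None else contextlib.nullcontext()
    with guard:
        with shelve.open(cache_path) as cache:
            if key in cache:
                return cache[key]
            # Cache miss: compute the value and store it before returning.
            value = compute(key)
            cache[key] = value
            return value


# Local demonstration; the temporary directory stands in for "/root/cache".
calls = []


def slow_square(key):
    calls.append(key)  # record each real computation
    return int(key) ** 2


with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "shelve")
    first = get_or_compute(path, "7", slow_square)
    second = get_or_compute(path, "7", slow_square)  # served from the cache
```

Checking membership with `key in cache` rather than comparing against None lets a legitimately cached None value count as a hit. Holding the lock for the whole computation keeps two callers from computing the same key, at the cost of blocking other keys. Whether that trade-off is acceptable depends on how expensive the computation is.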
Persisting volumes
By default, a modal.SharedVolume lives as long as the app it’s defined in, just like any other Modal object. However, in many situations you might want to persist the cache between runs of the app. To do this, you can use the persist method on the SharedVolume object. For example, to avoid re-downloading a Hugging Face model checkpoint every time you run your function:
import modal

volume = modal.SharedVolume().persist("model-cache-vol")
stub = modal.Stub()

CACHE_DIR = "/cache"

@stub.function(
    shared_volumes={CACHE_DIR: volume},
    # Set the transformers cache directory to the volume we created above.
    # For details, see https://huggingface.co/transformers/v4.0.1/installation.html#caching-models
    secret=modal.Secret.from_dict({"TRANSFORMERS_CACHE": CACHE_DIR}),
)
def run_inference():
    ...
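In the example, the transformers library handles the caching once it is pointed at the persisted directory. If your own code writes files to a persisted volume, the same "download once, reuse afterwards" behavior could look roughly like the sketch below. Everything in it is illustrative rather than part of Modal: ensure_checkpoint and fake_download are hypothetical names, the fake downloader stands in for a real network fetch, and a temporary directory stands in for the mounted /cache path.

```python
import os
import tempfile
from pathlib import Path


def ensure_checkpoint(cache_dir, name, download):
    """Return the path to `name` inside cache_dir, downloading it only if absent."""
    target = Path(cache_dir) / name
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a uniquely named temporary file, then rename it into place,
    # so an interrupted download is less likely to leave a half-written
    # file under the final name.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".partial")
    with os.fdopen(fd, "wb") as f:
        f.write(download(name))
    os.replace(tmp_name, target)
    return target


# Local demonstration; the temporary directory stands in for "/cache".
downloads = []


def fake_download(name):
    downloads.append(name)  # record each real "download"
    return b"weights for " + name.encode()


with tempfile.TemporaryDirectory() as cache_dir:
    p1 = ensure_checkpoint(cache_dir, "model.bin", fake_download)
    p2 = ensure_checkpoint(cache_dir, "model.bin", fake_download)  # reused
    contents = p1.read_bytes()
```

With a persisted volume mounted at the cache directory, the second call's behavior would carry over to later runs of the app, since the file outlives any single run.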
Deleting volumes
To remove a persisted shared volume, deleting all its data, you must stop the volume. This can be done via the volume’s dashboard app page or the CLI.
For example, a volume with the name my-vol that lives in the e-corp workspace could be stopped (i.e. deleted) by going to its dashboard page at https://modal.com/apps/e-corp/my-vol and clicking the trash icon. Alternatively, you can use the volume’s app ID with modal app stop.
(Volumes are currently a specialized app type within Modal, which is why deleting a volume is done by stopping an app.)
Further examples
- The Modal Podcast Transcribe uses a persisted volume to durably store raw audio, metadata, and finished transcriptions.
- News Article Summarizer uses a persisted volume to store pretrained models.