May 20, 2025 · 10 minute read
Modal's Serverless KV Store Gets Its Limit Raised to Infinity
Daniel Shaar (@dshaar_), Member of Technical Staff

Modal’s Dict primitive provides users with a simple TTL’ed (time-to-live) key-value store that can be accessed from any container within the same workspace environment. Dicts are well-suited for things like caching the results of function calls and communicating state changes among a fleet of containers.
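
If you haven't used a Dict before, here's a minimal sketch of the basic API (the Dict name "my-cache" is just an example):

import modal

cache = modal.Dict.from_name("my-cache", create_if_missing=True)
cache["greeting"] = "hello"   # dict-style writes
print(cache.get("greeting"))  # dict-style reads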

Today, we’re excited to announce some major improvements to Dicts, including smarter caching, a new locking feature, and data durability!

🧑‍🚀 A few small changes to Dict, a giant leap for Dict use

Here’s what we’ve changed:

                     Legacy Dicts               New Dicts
Storage limit        10 GiB                     Unlimited
Item expiry policy   30 days since last write   7 days since last write OR read
Locking primitive    N/A                        .put() now supports a skip_if_exists flag
Durability           Not durable                Durable

These changes will apply to all newly created Dicts. Some cool things we think these features enable:

  • LRU-like caching: now that reading extends an item’s TTL, hot cache entries will stick around for as long as they’re needed. And with unlimited items, there’s no need to worry about evicting useful data.
  • Distributed locking: in the event that many containers try to perform a redundant operation or state change, you can guarantee “exactly once” semantics using skip_if_exists (see the sketch below).
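
Here's a minimal sketch of that locking pattern, assuming a Dict named "locks" and an arbitrary job key:

import modal

locks = modal.Dict.from_name("locks", create_if_missing=True)

# put() returns True only for the first writer; everyone else gets False.
if locks.put("job-123", "claimed", skip_if_exists=True):
    ...  # we hold the lock: safe to perform the one-time operation
else:
    ...  # someone else already claimed this job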

With these new properties, let’s see how we can better tackle a common use case for Dicts: reducing backend load by caching function call results.

🧱 Building a request cache, Dict by Dict

Let’s look at a common app structure without Dicts that we may want to build and optimize on Modal. In this example, we have an “expensive” function that takes a while to run, along with a high concurrency web endpoint that simply calls out to the function.

import time

import modal

APP_NAME = "expensive-function-app"  # example name; substitute your own
app = modal.App(APP_NAME)

@app.function()
def expensive_function(x: int) -> int:
    time.sleep(30)  # simulate a slow backend operation
    return x ** 2

@app.function(image=modal.Image.debian_slim().pip_install("fastapi[standard]"))
@modal.concurrent(max_inputs=100)
@modal.fastapi_endpoint()
def expensive_function_endpoint(x: int) -> int:
    expensive_function_modal = modal.Function.from_name(APP_NAME, "expensive_function")
    return expensive_function_modal.remote(x)

After running this app in production for a while, we discover that users are issuing the same few requests to the web endpoint—who knew that 13 squared being 169 would be all the rage? Not only that, but our app typically sees bursts of traffic for these hot requests.

As was commonly done by Modal users with the previous version of Dicts, we can define some sort of request caching class to wrap our function calls. A sample interface could look like:

from typing import Any

import modal

class RequestCacher:
    """Utility class using `modal.Dict` to issue deduped requests and cache the results."""

    def __init__(self, function: modal.Function):
        self.function = function.hydrate()
        self.cache = modal.Dict.from_name(f"{function.object_id}-cache", create_if_missing=True)

    def _fetch_cached_result(self, request_id: bytes) -> Any:
        pass  # Used by `.call()`; polls for / fetches a result another caller produced.

    def call(self, request_id: bytes, *args, **kwargs) -> Any:
        pass  # Checks the cache; on a miss, issues the call and stores the result.

We go ahead and implement some straightforward caching logic—check the cache, if the entry isn’t there, then make the call ourselves and add it to the cache when we’re done. Problem solved!
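
A minimal sketch of that first pass (no deduplication yet):

    def call(self, request_id: bytes, *args, **kwargs) -> Any:
        try:
            return self.cache[request_id]  # cache hit
        except KeyError:
            pass  # cache miss: compute the result ourselves
        result = self.function.remote(*args, **kwargs)
        self.cache[request_id] = result
        return result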

Or so we thought… it turns out those bursts of traffic contain many identical requests arriving all at once, so we still end up making a bunch of expensive calls to our backend code before the cache is populated. We could work hard to narrow that race-condition window and handle the edge cases. But really, wouldn’t it be great if we could guarantee that we make only one call to our expensive function?

With Dicts, we can make something quite snazzy to do just this!

🔒 Deduping requests—pop it, lock it

[Diagram: request cacher flow]

Glossing over how a production version of this (that we hope to release in our client 👀) would handle various failure modes, the request handling logic now looks like:

  • Try to “acquire a lock” by putting a pending entry in the Dict if it doesn’t already exist.
  • If someone else is / was working on the request, we read the Dict entry and poll for / fetch the result.
  • Otherwise, assuming we successfully wrote the pending entry:
    • We .spawn() a function call and insert its handle as an in_progress Dict entry.
    • Once the function call is complete, we insert the result as a completed Dict entry.

Here’s a sample implementation:

    def call(self, request_id: bytes, *args, **kwargs) -> Any:
        pending = _CacheEntry(_RequestState.PENDING, time.time())
        if not self.cache.put(request_id, pending, skip_if_exists=True):
            # This request is already being worked on or done.
            return self._fetch_cached_result(request_id)

        # Issue the request and populate the cache with the function call handle.
        function_call = self.function.spawn(*args, **kwargs)
        in_progress = _CacheEntry(_RequestState.IN_PROGRESS, function_call)
        self.cache.put(request_id, in_progress)

        # Once the function call completes, populate the cache with the result.
        result = function_call.get()
        completed = _CacheEntry(_RequestState.COMPLETED, result)
        self.cache.put(request_id, completed)
        return result
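
The snippet above leans on a few helpers that aren't shown. Here's one plausible shape for them, as a sketch (the field layout and the polling loop are assumptions, not a prescribed design):

import enum
import time
from typing import Any, NamedTuple

class _RequestState(enum.Enum):
    PENDING = 1      # lock acquired; function call not yet spawned
    IN_PROGRESS = 2  # entry value is a modal.FunctionCall handle
    COMPLETED = 3    # entry value is the final result

class _CacheEntry(NamedTuple):
    state: _RequestState
    value: Any

Within RequestCacher, _fetch_cached_result can then poll until a usable entry appears:

    def _fetch_cached_result(self, request_id: bytes, poll_interval: float = 0.1) -> Any:
        while True:
            entry: _CacheEntry = self.cache[request_id]
            if entry.state == _RequestState.COMPLETED:
                return entry.value
            if entry.state == _RequestState.IN_PROGRESS:
                return entry.value.get()  # block on the spawned function call
            time.sleep(poll_interval)  # still PENDING; wait for the handle to land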

The proof is in the dashboard: here I’ve issued three identical requests to our web endpoint. The second request was deduped against the first, and the third just got the result from the cache:

[Screenshot: not-so-expensive-function-endpoint dashboard]

Not only does this significantly speed up the customer experience, but we also end up calling the expensive function only once—success!

[Screenshot: expensive-function dashboard]

💸 Shut up and give me my Dicts

Whether it’s caching, locking, or some other state management, just create a new Dict to get started! For more details, check out our docs. Caveat: to use the skip_if_exists flag, you may need to upgrade your client version.

Got questions? Come hang out in our community Slack—we’d love to hear what you’re building.
