modal.Cls

class Cls(modal.object.Object)

Cls adds method pooling and lifecycle hook behavior to modal.Function.

Generally, you will not construct a Cls directly. Instead, use the @app.cls() decorator on the App object.

hydrate 

hydrate(self, client=None)

Synchronize the local object with its identity on the Modal server.

It is rarely necessary to call this method explicitly, as most operations will lazily hydrate when needed. The main use case is when you need to access object metadata, such as its ID.

Added in v0.72.39: This method replaces the deprecated .resolve() method.
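
The lazy-hydration behavior described above can be illustrated with a small stand-in class. This is a minimal sketch, not Modal's implementation: `LazyRef`, its `lookups` counter, and the `id-...` format are all hypothetical names invented for illustration. It only shows why an explicit `hydrate()` call is useful when you want metadata (such as an ID) before performing any other operation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LazyRef:
    """Hypothetical stand-in for a lazily hydrated object; not Modal's implementation."""

    name: str
    object_id: Optional[str] = None  # unknown until hydrated
    lookups: int = 0                 # counts simulated server round trips

    @property
    def is_hydrated(self) -> bool:
        return self.object_id is not None

    def hydrate(self) -> "LazyRef":
        # Idempotent: only the first call performs the (simulated) lookup.
        if not self.is_hydrated:
            self.lookups += 1
            self.object_id = f"id-{self.name}"  # made-up ID format
        return self

    def call(self, x: int) -> int:
        # Operations hydrate on demand, so callers rarely call hydrate() themselves.
        self.hydrate()
        return x * 2


ref = LazyRef("Model")
assert not ref.is_hydrated   # nothing has been looked up yet
ref.hydrate()                # explicit hydration, e.g. to read the ID up front
hydrated_id = ref.object_id
ref.call(21)                 # later operations reuse the hydrated state
```

In this toy model the explicit call and the on-demand path share one code path, so hydrating early costs nothing extra later.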

from_name 

from_name(cls, app_name, name, *, version=None, environment_name=None,
    client=None)

Reference a Cls from a deployed App by its name.

This is a lazy method that defers hydrating the local object with metadata from Modal servers until the first time it is actually used.

Parameters

app_name str
Name of the deployed App that defines this class.
name str
Object tag of the Cls within that App.
environment_name str | None
Workspace environment for the lookup; defaults to the active environment.
client _Client | None
Optional Modal client; defaults to the process client.

Returns

A Cls reference that hydrates on first use.

Usage

Model = modal.Cls.from_name("other-app", "Model")

The version parameter constructs a version-pinned Cls:

Modelv3 = modal.Cls.from_name("other-app", "Model", version=3)

with_options 

with_options(self, *, cpu=None, memory=None, gpu=None, env=None, secrets=None,
    volumes={}, retries=None, max_containers=None, buffer_containers=None,
    scaledown_window=None, timeout=None, region=None, cloud=None)

Override the static Cls configuration with invocation-specific values.

This method will return a new variant of the Cls that will autoscale independently of the base configuration.

Note that options cannot be “unset” with this method. For example, if a GPU is configured in the @app.cls() decorator, passing gpu=None here will not create a CPU-only instance.

Container arguments (volumes and secrets) from later calls replace earlier values; they are not merged.

Parameters

cpu float | tuple[float, float] | None
CPU cores for instances created from this Cls (see @app.function / @app.cls resource options).
memory int | tuple[int, int] | None
Memory in MiB, or min/max pair, for those instances.
gpu str | None
GPU type string, for example A100.
env dict[str, str | None] | None
Environment variables merged into a temporary secret for this configuration.
secrets Collection[_Secret] | None
Additional secrets attached to the service function.
volumes dict[str | PurePosixPath, _Volume | _CloudBucketMount]
Volume and cloud-bucket mounts, given as a mapping from mount path to Volume or CloudBucketMount. (Default is {})
retries int | Retries | None
Retry policy or count for invocations.
max_containers int | None
Cap on concurrently running containers for this Cls configuration.
buffer_containers int | None
Extra idle containers kept warm while the Function is active.
scaledown_window int | None
Seconds a container may stay idle before scaling down.
timeout int | None
Function timeout in seconds.
region str | Sequence[str] | None
One region or a list of regions to schedule on.
cloud str | None
Cloud provider (for example aws, gcp, oci, or auto).

Returns

A new Cls with the overrides applied on top of the existing configuration.

Usage

You can use this method after looking up the Cls from a deployed App, or when you hold a direct reference to the Cls from another Function or local entrypoint on its App:

Model = modal.Cls.from_name("my_app", "Model")
ModelUsingGPU = Model.with_options(gpu="A100")
input_prompt = "Hello, world"  # placeholder input for illustration
ModelUsingGPU().generate.remote(input_prompt)  # Run with an A100 GPU

The method can be called multiple times to “stack” updates:

Model.with_options(gpu="A100").with_options(scaledown_window=300)  # Use an A100 with slow scaledown
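
The override rules described in this section (values cannot be unset, stacked calls accumulate, and container arguments are replaced rather than merged) can be modeled with a plain dataclass. This is a rough sketch under stated assumptions, not Modal's code: the `Options` class is hypothetical, it covers only three of the parameters, and it assumes that empty values such as `None` or `{}` are treated as "not provided".

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class Options:
    """Hypothetical subset of the options listed above."""

    gpu: Optional[str] = None
    scaledown_window: Optional[int] = None
    volumes: dict = field(default_factory=dict)

    def with_options(self, **overrides) -> "Options":
        # Assumption: unset/empty values (None, {}) are treated as "not provided",
        # so they never clear an existing setting.
        provided = {k: v for k, v in overrides.items() if v}
        # Provided values replace the old ones wholesale; dicts are not merged.
        return replace(self, **provided)


base = Options(gpu="A100", volumes={"/data": "vol-a"})
stacked = base.with_options(scaledown_window=300).with_options(gpu=None)
swapped = base.with_options(volumes={"/models": "vol-b"})
```

Here `stacked` still has `gpu="A100"` alongside the new scaledown window, while `swapped` keeps only the `/models` mount. Each call returns a new object, so `base` is left unchanged.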

with_concurrency 

with_concurrency(self, *, max_inputs, target_inputs=None)

Override the static Cls configuration with invocation-specific input concurrency settings.

Parameters

max_inputs int
Maximum number of inputs processed concurrently per container.
target_inputs int | None
Optional concurrency level that the autoscaler tries to maintain per container; see the @app.cls / Function concurrency docs.

Returns

A new Cls with the merged concurrency settings.

Usage

Model = modal.Cls.from_name("my_app", "Model")
ModelUsingGPU = Model.with_options(gpu="A100").with_concurrency(max_inputs=100)
ModelUsingGPU().generate.remote(42)  # will run on an A100 GPU with input concurrency enabled
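
To make the meaning of max_inputs concrete, the following sketch simulates one container that accepts several inputs at once but never more than the cap. It uses only the standard library; `run_container` and its bookkeeping are hypothetical names for illustration and say nothing about how Modal schedules inputs internally.

```python
import asyncio


async def run_container(inputs, max_inputs):
    """Process inputs with at most max_inputs in flight; returns (results, peak)."""
    slots = asyncio.Semaphore(max_inputs)
    in_flight = 0
    peak = 0

    async def handle(x):
        nonlocal in_flight, peak
        async with slots:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for real work
            in_flight -= 1
            return x * 2

    results = await asyncio.gather(*(handle(x) for x in inputs))
    return results, peak


results, peak = asyncio.run(run_container(range(10), max_inputs=3))
```

With ten inputs and `max_inputs=3`, the observed peak concurrency equals the cap, and results come back in input order.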

with_batching 

with_batching(self, *, max_batch_size, wait_ms)

Override the static Cls configuration with invocation-specific dynamic batching settings.

Parameters

max_batch_size int
Maximum batch size for dynamic batching.
wait_ms int
Maximum time to wait to fill a batch, in milliseconds.

Returns

A new Cls with the merged batching settings.

Usage

Model = modal.Cls.from_name("my_app", "Model")
ModelUsingGPU = Model.with_options(gpu="A100").with_batching(max_batch_size=100, wait_ms=1000)
ModelUsingGPU().generate.remote(42)  # A100 with dynamic batching
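
The interaction between max_batch_size and wait_ms can be sketched as a small offline simulation. This is an illustrative model only: `form_batches` is a hypothetical helper, and it assumes a batch closes either when it is full or when wait_ms has elapsed since its first input arrived, whichever comes first.

```python
def form_batches(arrivals, max_batch_size, wait_ms):
    """Group (arrival_ms, item) pairs, sorted by time, into batches.

    Assumption: a batch closes when it holds max_batch_size items or when
    wait_ms has elapsed since its first item arrived, whichever comes first.
    """
    batches = []
    current = []
    deadline = None
    for arrival_ms, item in arrivals:
        # An input arriving after the deadline starts a new batch.
        if current and arrival_ms > deadline:
            batches.append(current)
            current = []
        if not current:
            deadline = arrival_ms + wait_ms
        current.append(item)
        if len(current) == max_batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches


arrivals = [(0, "a"), (10, "b"), (20, "c"), (1500, "d"), (1600, "e")]
batches = form_batches(arrivals, max_batch_size=2, wait_ms=1000)
```

In this run the first batch closes because it is full, the second closes because its wait window expires with a single item, and the third fills again. A larger wait_ms trades latency for fuller batches.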