GPU Glossary

What is Shared Memory?

Shared memory is the level of the memory hierarchy corresponding to the thread block level of the thread hierarchy in the CUDA programming model. It is much smaller than global memory but much faster, with both higher throughput and lower latency.

A fairly typical kernel therefore stages data through shared memory: each thread block loads a tile from global memory into shared memory, synchronizes, computes on the fast on-chip copy, and writes results back to global memory.
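The staging pattern above can be sketched as a minimal CUDA kernel. The kernel name, tile size, and the reversal computation are illustrative choices, not from the glossary, and the sketch assumes the input length is a multiple of the block size.

```cuda
#define TILE_SIZE 256  // illustrative tile/block size

// Hypothetical kernel: reverses each block-sized tile of the input,
// assuming n is a multiple of blockDim.x (== TILE_SIZE).
__global__ void tileReverse(const float* in, float* out, int n) {
    // One tile per thread block, stored in fast on-chip shared memory.
    __shared__ float tile[TILE_SIZE];

    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // 1. Stage: copy this thread's element from global into shared memory.
    tile[threadIdx.x] = in[idx];

    // 2. Synchronize: every thread in the block must finish loading
    //    before any thread reads another thread's element.
    __syncthreads();

    // 3. Compute from the shared copy: write the tile out in reverse order.
    int r = blockDim.x - 1 - threadIdx.x;
    out[idx] = tile[r];
}
```

Because threads read elements loaded by other threads in step 3, the `__syncthreads()` barrier in step 2 is required for correctness, not just performance.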

Shared memory is stored in the L1 data cache of the GPU's Streaming Multiprocessor (SM).
