GPU Glossary
/device-software/shared-memory

Shared Memory

Shared memory is the abstract memory associated with the thread block level (left, center) of the CUDA thread group hierarchy (left). Modified from diagrams in NVIDIA's CUDA Refresher: The CUDA Programming Model and the NVIDIA CUDA C++ Programming Guide .

Shared memory is the level of the memory hierarchy corresponding to the thread block level of the thread group hierarchy in the CUDA programming model . It is generally expected to be much smaller but much faster (in throughput and latency) than the global memory .

A fairly typical kernel therefore looks something like this:

Shared memory is stored in the L1 data cache of the GPU's Streaming Multiprocessor (SM) .

Something seem wrong?
Or want to contribute?
Email: glossary@modal.com