GPU Glossary

Global Memory

Global memory is the highest level of the memory hierarchy in the CUDA programming model. It is stored in the GPU's RAM.

Modified from diagrams in NVIDIA's CUDA Refresher: The CUDA Programming Model and the NVIDIA CUDA C++ Programming Guide.

As part of the CUDA programming model, each level of the thread group hierarchy has access to a matching level of the memory hierarchy. This memory can be used for coordination and communication, and it is managed by the programmer, not by the hardware or a runtime.

The highest level of that memory hierarchy is global memory. Global memory is global in both its scope and its lifetime: it is accessible by every thread in a thread block grid, and it persists for as long as the program executes.
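That scope and lifetime can be sketched with a variable declared in global memory. This is a minimal illustrative example, not taken from the glossary entry; the kernel and variable names are hypothetical:

```cuda
#include <cstdio>

// A variable in global memory: visible to every thread of every grid,
// alive for the duration of the program (hypothetical names).
__device__ int flag;

__global__ void writer() {
    if (threadIdx.x == 0) flag = 42;  // written by one grid...
}

__global__ void reader(int *out) {
    if (threadIdx.x == 0) *out = flag;  // ...read by a later grid
}

int main() {
    int *d_out, h_out = 0;
    cudaMalloc(&d_out, sizeof(int));
    writer<<<1, 32>>>();          // first launch writes global memory
    reader<<<1, 32>>>(d_out);     // second launch sees the earlier write
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    printf("%d\n", h_out);
    cudaFree(d_out);
    return 0;
}
```

The value survives between kernel launches precisely because global memory's lifetime is tied to the program, not to any one grid.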

Access to data structures in global memory can be synchronized across all accessors using atomic instructions, as with CPU memory. Within a cooperative thread array, access can be more tightly synchronized, e.g. with barriers.
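As a hedged sketch of the atomic case (the histogram example and names are illustrative, not from the glossary entry), threads from any block in the grid can safely update shared counters in global memory:

```cuda
// Threads across the whole grid update shared bins in global memory.
// atomicAdd serializes conflicting writes from any thread, in any block.
__global__ void histogram(const int *data, int n, int *bins) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAdd(&bins[data[i]], 1);
    }
    // Within a single thread block (cooperative thread array), tighter
    // synchronization is available, e.g. the __syncthreads() barrier.
}
```

Atomics order individual operations on a location; they do not provide a grid-wide barrier, which is why tighter coordination is scoped to the cooperative thread array.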

This level of the memory hierarchy is typically implemented in the GPU's RAM and allocated from the host using a memory allocator provided by the CUDA Driver API or the CUDA Runtime API.
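With the CUDA Runtime API, for instance, that allocation looks like the following minimal host-side sketch (buffer name and size are illustrative):

```cuda
int main() {
    const int n = 1 << 20;     // one million floats, as an example size
    float *d_buf = nullptr;

    // Allocate global memory from the host via the CUDA Runtime API.
    cudaMalloc(&d_buf, n * sizeof(float));

    // Global memory is the usual target of host-device transfers and
    // the memory that kernels read and write by default.
    cudaMemset(d_buf, 0, n * sizeof(float));

    // ... launch kernels that operate on d_buf ...

    cudaFree(d_buf);
    return 0;
}
```

The Driver API offers the equivalent lower-level allocator (`cuMemAlloc`/`cuMemFree`); the Runtime API calls shown here are the more common entry point.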

Something seem wrong?
Or want to contribute?
Email: glossary@modal.com