GPU RAM
The global memory of the GPU is a large (many megabytes to gigabytes) memory store that is addressable by all of the GPU's Streaming Multiprocessors (SMs).
It is also known as GPU RAM (random access memory) or video RAM (VRAM). It uses Dynamic RAM (DRAM) cells, which are slower but smaller than the Static RAM (SRAM) used in registers and shared memory. For details on DRAM and SRAM, we recommend Ulrich Drepper's 2007 article "What Every Programmer Should Know About Memory".
It is generally not on the same die as the SMs, though in the latest data center-grade GPUs like the H100, it is located on a shared interposer for decreased latency and increased bandwidth (aka "high-bandwidth memory").
RAM is used to implement the global memory of the CUDA programming model and to store register data that spills from the register file.
An H100 can store 80 GiB (85,899,345,920 bytes, or 687,194,767,360 bits) in its RAM.
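As a quick sanity check on that figure, the GiB-to-bits conversion is just powers of two; a minimal sketch (the helper name `gib_to_bits` is ours, nothing H100-specific):

```python
def gib_to_bits(gib: int) -> int:
    """Convert a capacity in GiB (binary gigabytes) to bits."""
    bytes_total = gib * 2**30  # 1 GiB = 2^30 bytes
    return bytes_total * 8     # 8 bits per byte

print(gib_to_bits(80))  # → 687194767360
```

Note that this uses the binary prefix (GiB, 2^30 bytes), not the decimal gigabyte (GB, 10^9 bytes) sometimes quoted in marketing materials.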