Compute Capability
Instructions in the Parallel Thread Execution (PTX) instruction set are compatible with only certain physical GPUs. The versioning system used to abstract away details of physical GPUs from the instruction set and compiler is called "Compute Capability".
Most compute capability version numbers have two components: a major version and a minor version. NVIDIA promises forward compatibility (old PTX code runs on new GPUs) across both major and minor versions following the onion layer model.
With Hopper, NVIDIA has introduced an additional version suffix, the a in 9.0a, which includes features that deviate from the onion model: their future support is not guaranteed.
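The compatibility rule described above can be sketched as a small check. This is a simplified illustration, not NVIDIA's implementation: the names `ComputeCapability`, `parse_cc`, and `ptx_can_run_on` are hypothetical, and the handling of the a suffix assumes, conservatively, that such PTX runs only on the exact compute capability it targets.

```python
import re
from typing import NamedTuple


class ComputeCapability(NamedTuple):
    major: int
    minor: int
    arch_specific: bool = False  # True for the "a" suffix, e.g. 9.0a


def parse_cc(text: str) -> ComputeCapability:
    """Parse strings such as "8.6" or "9.0a" (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)\.(\d+)(a?)", text.strip())
    if match is None:
        raise ValueError(f"not a compute capability: {text!r}")
    major, minor, suffix = match.groups()
    return ComputeCapability(int(major), int(minor), suffix == "a")


def ptx_can_run_on(ptx_target: str, device: str) -> bool:
    """Simplified check: can PTX built for ptx_target run on device?"""
    target = parse_cc(ptx_target)
    gpu = parse_cc(device)
    if target.arch_specific:
        # Assumption: suffixed targets sit outside the onion model, so we
        # only accept the exact capability they were built for.
        return (target.major, target.minor) == (gpu.major, gpu.minor)
    # Onion layers: each newer capability contains the older ones, so
    # tuple comparison on (major, minor) is enough.
    return (gpu.major, gpu.minor) >= (target.major, target.minor)
```

For example, `ptx_can_run_on("7.0", "9.0")` is true under this sketch, while `ptx_can_run_on("9.0a", "10.0")` is false because the suffixed target carries no forward-compatibility promise.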
Target compute capabilities for PTX compilation can be specified when invoking nvcc, the NVIDIA CUDA Compiler Driver. By default, the compiler will also generate optimized SASS for the matching Streaming Multiprocessor (SM) architecture.
The documentation for nvcc refers to compute capability as a "virtual GPU architecture", in contrast to the "physical GPU architecture" expressed by the SM version.
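As a rough sketch of how the two kinds of architecture appear on the nvcc command line, the helper below builds a `-gencode` argument pair, where `compute_XY` names the virtual architecture and `sm_XY` names the physical one. The function names `arch_names` and `gencode_args` are hypothetical, and the snippet only assembles strings; it does not invoke nvcc.

```python
from typing import List, Tuple


def arch_names(major: int, minor: int) -> Tuple[str, str]:
    """Return (virtual, physical) architecture names for a compute capability."""
    virtual = f"compute_{major}{minor}"  # PTX target ("virtual GPU architecture")
    physical = f"sm_{major}{minor}"      # SASS target ("physical GPU architecture")
    return virtual, physical


def gencode_args(major: int, minor: int, embed_ptx: bool = True) -> List[str]:
    """Build a -gencode argument pair for nvcc (string assembly only)."""
    virtual, physical = arch_names(major, minor)
    if embed_ptx:
        # Keep the PTX next to the SASS so newer GPUs can still compile
        # the PTX for themselves at load time.
        code = f"[{physical},{virtual}]"
    else:
        code = physical
    return ["-gencode", f"arch={virtual},code={code}"]


if __name__ == "__main__":
    # Show the arguments for compute capability 8.0 with embedded PTX.
    print(" ".join(gencode_args(8, 0)))
```

Embedding the PTX alongside the SASS is a common choice when the binary may later run on GPUs newer than the one it was built for; depending on the shell, the bracketed list may need quoting.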
The technical specifications for each compute capability version can be found in the Compute Capability section of the NVIDIA CUDA C Programming Guide.