GPU Glossary

What is a Thread Block?

Thread blocks are an intermediate level of the thread group hierarchy of the CUDA programming model (left). A thread block executes on a single Streaming Multiprocessor (right, middle). Modified from diagrams in NVIDIA's CUDA Refresher: The CUDA Programming Model and the NVIDIA CUDA C++ Programming Guide.

A thread block is a level of the CUDA programming model's thread hierarchy below a grid but above a warp. It is the CUDA programming model's abstract equivalent of the concrete cooperative thread arrays in PTX/SASS.

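Blocks are the smallest unit of thread coordination exposed to programmers: threads in the same block can synchronize at a barrier and exchange data through shared memory. Blocks must execute independently of one another, so that any execution order for blocks is valid, from fully serial in any order to fully concurrent with arbitrary interleaving.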
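As a rough illustration of coordination within a block and independence across blocks, here is a minimal sketch of a per-block sum. The kernel name `block_sum`, the constant `kThreadsPerBlock`, and the use of managed memory are assumptions made for this example. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical block size: a power of two and a multiple of the warp size.
constexpr int kThreadsPerBlock = 256;

// Each block sums its own slice of `in` and writes one partial result.
// Threads coordinate only within their block, via shared memory and
// the block-wide barrier __syncthreads().
__global__ void block_sum(const float* in, float* partial, int n) {
    __shared__ float scratch[kThreadsPerBlock];

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    scratch[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();  // waits for the threads of this block only

    // Tree reduction in shared memory; assumes blockDim.x is a power of two.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) {
            scratch[threadIdx.x] += scratch[threadIdx.x + stride];
        }
        __syncthreads();  // reached by every thread in the block
    }

    // No block reads another block's output, so the result does not
    // depend on the order in which blocks are scheduled.
    if (threadIdx.x == 0) partial[blockIdx.x] = scratch[0];
}

int main() {
    const int n = 1000;
    const int num_blocks = (n + kThreadsPerBlock - 1) / kThreadsPerBlock;

    float *in, *partial;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&partial, num_blocks * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    block_sum<<<num_blocks, kThreadsPerBlock>>>(in, partial, n);
    cudaDeviceSynchronize();

    // Combine the per-block results on the host.
    float total = 0.0f;
    for (int b = 0; b < num_blocks; ++b) total += partial[b];
    printf("total = %.1f\n", total);

    cudaFree(in);
    cudaFree(partial);
    return 0;
}
```

Compiled with `nvcc` and run on a CUDA-capable GPU, this should print `total = 1000.0`. The partial sums are combined on the host because the blocks of this kernel have no barrier of their own to wait on one another.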

A single CUDA kernel launch produces one or more thread blocks (in the form of a block grid), each of which contains one or more warps. Block size is chosen by the programmer at launch, up to a hardware limit (1,024 threads per block on current CUDA GPUs), and is typically a multiple of the warp size (32 on all current CUDA GPUs).
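The launch arithmetic can be sketched as follows. The helper `blocks_for`, the kernel `scale`, and the choice of four warps per block are hypothetical names and values picked for illustration. The warp size and the per-block thread limit are read from the device with `cudaGetDeviceProperties` rather than hard-coded. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical host helper: number of blocks needed to cover n elements
// (ceiling division).
int blocks_for(int n, int threads_per_block) {
    return (n + threads_per_block - 1) / threads_per_block;
}

// Multiplies each element of x by a, one thread per element.
__global__ void scale(float* x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;  // guard: the last block may extend past n
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // properties of device 0
    printf("warp size: %d, max threads per block: %d\n",
           prop.warpSize, prop.maxThreadsPerBlock);

    const int n = 1000;
    const int threads_per_block = 4 * prop.warpSize;  // four warps per block
    const int num_blocks = blocks_for(n, threads_per_block);
    printf("launching %d blocks of %d threads\n", num_blocks, threads_per_block);

    float* x;
    cudaMallocManaged(&x, n * sizeof(float));
    for (int i = 0; i < n; ++i) x[i] = 1.0f;

    // One kernel launch produces a grid of num_blocks thread blocks.
    scale<<<num_blocks, threads_per_block>>>(x, 2.0f, n);
    cudaDeviceSynchronize();

    printf("x[0] = %.1f, x[n-1] = %.1f\n", x[0], x[n - 1]);
    cudaFree(x);
    return 0;
}
```

With a warp size of 32, this launches 8 blocks of 128 threads for 1,000 elements, so the final block has 24 threads with no element to process. That is why the kernel checks `i < n` before writing.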
