GPU Glossary

What is a CUDA Thread?

A thread of execution (or "thread" for short) is the lowest unit of programming for GPUs, the base and atom of the CUDA programming model's thread hierarchy. A thread has its own registers, but little else.

Both SASS and PTX programs target threads. Compare this to a typical C program in a POSIX environment, which targets a process, itself a collection of one or more threads. Unlike POSIX threads, CUDA threads are not used to make syscalls.
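To make "targeting threads" concrete, here is a minimal CUDA C++ sketch (the kernel and variable names are illustrative, not from the source). The kernel body is written from the perspective of a single thread: each thread computes its own index within the thread hierarchy and handles one element.

```cuda
#include <cstdio>

// Each thread executes this function body independently,
// operating only on the element selected by its own index.
__global__ void scale(float *data, float factor, int n) {
    // Derive this thread's global index from its position in the
    // thread hierarchy: block index, block size, and thread index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;  // one thread, one element
    }
}

int main() {
    const int n = 1024;
    float *data;
    cudaMallocManaged(&data, n * sizeof(float));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    // Launch 4 blocks of 256 threads: 1024 threads, one per element.
    scale<<<4, 256>>>(data, 2.0f, n);
    cudaDeviceSynchronize();

    printf("data[0] = %.1f\n", data[0]);
    cudaFree(data);
    return 0;
}
```

Note that nothing in the kernel loops over the array: the "loop" is implicit in launching one thread per element, which is what distinguishes this programming model from a typical single-threaded C program.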

Like a thread on a CPU, a GPU thread can have a private instruction pointer/program counter. However, for performance reasons, GPU programs are generally written so that all the threads in a warp share the same instruction pointer, executing instructions in lock-step (see also Warp Scheduler).
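A hedged sketch of why lock-step execution matters: when threads in the same warp take different sides of a branch, the warp can no longer advance a single shared instruction pointer, so the hardware runs each path in turn with the non-participating threads masked off (this illustrative kernel is not from the source).

```cuda
// Illustrative only: adjacent threads in a warp take opposite
// branches, so the warp serializes the two paths rather than
// executing them in lock-step. Kernels written so that whole
// warps take the same branch avoid this cost.
__global__ void diverge(int *out) {
    if (threadIdx.x % 2 == 0) {
        out[threadIdx.x] = 1;   // even lanes execute while odd lanes wait
    } else {
        out[threadIdx.x] = 2;   // then odd lanes execute while even lanes wait
    }
}
```

Branching on, say, `blockIdx.x` instead of `threadIdx.x % 2` would keep every warp uniform and avoid the serialization entirely.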

Also like threads on CPUs, GPU threads have stacks in global memory for storing spilled registers and a function call stack, but high-performance kernels generally limit use of both.

A single CUDA Core executes instructions from a single thread.
