What is a Texture Processing Cluster?
A Texture Processing Cluster (TPC) is a pair of adjacent Streaming Multiprocessors (SMs).
Before the Blackwell SM architecture, TPCs were not mapped onto any level of the CUDA programming model's memory hierarchy or thread hierarchy.
The fifth-generation Tensor Cores in the Blackwell SM architecture added the "CTA pair" level of the Parallel Thread eXecution (PTX) thread hierarchy, which maps onto TPCs. Many tcgen05 PTX instructions include a .cta_group field that can use a single SM (.cta_group::1) or a pair of SMs in a TPC (.cta_group::2), which are mapped to 1SM and 2SM variants of Streaming Assembler (SASS) instructions like MMA.
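To make the "CTA pair" level concrete, here is a minimal Python sketch of how the CTAs of a thread block cluster could be grouped into pairs. It assumes that two CTAs form a pair when their ranks within the cluster (the PTX `%cluster_ctarank` special register) differ only in the lowest bit, and that the cluster size is even when pairs are used. The function names are hypothetical and exist only for illustration; this models the indexing, not the hardware's actual placement of CTAs onto SMs.

```python
# Sketch only: models CTA-pair indexing under the assumption that paired CTAs
# have cluster ranks differing only in the lowest bit. All names are hypothetical.

def cta_pair_index(cluster_ctarank: int) -> int:
    """Index of the CTA pair that a given cluster rank belongs to."""
    return cluster_ctarank >> 1


def peer_cta_rank(cluster_ctarank: int) -> int:
    """Cluster rank of the other CTA in the same pair (flip the lowest bit)."""
    return cluster_ctarank ^ 1


def group_into_pairs(cluster_size: int) -> dict[int, list[int]]:
    """Group every CTA rank in a cluster into its CTA pair."""
    if cluster_size <= 0 or cluster_size % 2 != 0:
        # Assumption: pairing only makes sense for a positive, even cluster size.
        raise ValueError("cluster_size must be a positive even number")
    pairs: dict[int, list[int]] = {}
    for rank in range(cluster_size):
        pairs.setdefault(cta_pair_index(rank), []).append(rank)
    return pairs


if __name__ == "__main__":
    for pair, ranks in group_into_pairs(4).items():
        print(f"CTA pair {pair}: cluster ranks {ranks}")
```

Under this model, a `.cta_group::2` operation issued from rank 2 would cooperate with rank 3, and each such pair would occupy the two SMs of one TPC.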