What is a Thread Block Grid?
When a CUDA kernel is launched, it creates a collection of threads known as a thread block grid. Grids can be one-, two-, or three-dimensional. They are made up of thread blocks.
The matching level of the memory hierarchy is the global memory.
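As a rough illustration of the grid and its matching memory level, the sketch below launches a kernel over a two-dimensional grid and has every thread write the linear index of its thread block into a buffer in global memory. The kernel name `record_block_index`, the helper `blocks_needed`, and the 64 x 48 problem size are assumptions made up for this example; only the CUDA built-ins (`blockIdx`, `blockDim`, `gridDim`, `threadIdx`) and runtime calls are standard.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: how many blocks of `block_size` threads cover `n` elements.
__host__ __device__ inline unsigned int blocks_needed(unsigned int n, unsigned int block_size) {
    return (n + block_size - 1) / block_size;  // ceiling division
}

// Each in-bounds thread stores the linear index of its block in global memory.
__global__ void record_block_index(int *out, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        out[y * width + x] = blockIdx.y * gridDim.x + blockIdx.x;
    }
}

int main() {
    const int width = 64, height = 48;  // illustrative problem size
    dim3 block(16, 16);                 // threads per thread block
    dim3 grid(blocks_needed(width, block.x), blocks_needed(height, block.y));  // 4 x 3 blocks

    int *d_out = nullptr;  // buffer in global memory, visible to every block in the grid
    if (cudaMalloc(&d_out, width * height * sizeof(int)) != cudaSuccess) return 1;

    record_block_index<<<grid, block>>>(d_out, width, height);

    std::vector<int> h_out(width * height);
    cudaError_t err = cudaMemcpy(h_out.data(), d_out, h_out.size() * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("grid is %u x %u blocks\n", grid.x, grid.y);
    std::printf("block index recorded at the last element: %d\n", h_out.back());
    return 0;
}
```

The grid dimensions are chosen on the host at launch time, so the same kernel can cover a differently sized problem by changing only `grid`.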
Thread blocks are effectively independent units of computation. They execute concurrently, that is, in an indeterminate order. Execution can range from fully sequential, on a GPU with a single Streaming Multiprocessor, to fully parallel, on a GPU with sufficient resources to run all of the blocks simultaneously.
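One way to observe this indeterminate ordering is to have each thread block take a "ticket" from a shared counter in global memory. The sketch below is a minimal example under assumed names (`take_ticket`, `is_permutation_of_range`) and an arbitrary grid of eight blocks. The tickets always form a permutation of 0 through 7, but which block receives which ticket is not specified and may differ between runs or devices, so no expected output is shown.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical host-side check: true if `order` holds each of 0..n-1 exactly once.
bool is_permutation_of_range(const std::vector<unsigned int> &order) {
    std::vector<bool> seen(order.size(), false);
    for (unsigned int t : order) {
        if (t >= order.size() || seen[t]) return false;
        seen[t] = true;
    }
    return true;
}

// One thread per block atomically takes the next ticket from a counter in
// global memory. Tickets reflect the order in which the atomic operations
// were serviced, which the programming model does not fix.
__global__ void take_ticket(unsigned int *counter, unsigned int *order) {
    if (threadIdx.x == 0) {
        order[blockIdx.x] = atomicAdd(counter, 1u);
    }
}

int main() {
    const unsigned int num_blocks = 8;  // illustrative grid size
    unsigned int *d_counter = nullptr, *d_order = nullptr;
    if (cudaMalloc(&d_counter, sizeof(unsigned int)) != cudaSuccess) return 1;
    if (cudaMalloc(&d_order, num_blocks * sizeof(unsigned int)) != cudaSuccess) return 1;
    cudaMemset(d_counter, 0, sizeof(unsigned int));

    take_ticket<<<num_blocks, 32>>>(d_counter, d_order);

    std::vector<unsigned int> order(num_blocks);
    cudaError_t err = cudaMemcpy(order.data(), d_order,
                                 num_blocks * sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    cudaFree(d_order);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (unsigned int b = 0; b < num_blocks; ++b) {
        std::printf("block %u took ticket %u\n", b, order[b]);
    }
    std::printf("tickets form a permutation: %s\n",
                is_permutation_of_range(order) ? "yes" : "no");
    return 0;
}
```

Because no block waits on another, the kernel produces a valid result whether the blocks run one at a time or all at once. A kernel in which one block spins until another block makes progress has no such guarantee under an ordinary launch.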