CUDA Learning (2) -- Programming Model

Key Abstractions of CUDA

  • Thread Hierarchy –> Threads are grouped into blocks
  • Memory Hierarchy –> Local memory of threads, shared memory of blocks, and global memory of the grid
  • Heterogeneous Programming –> Kernels execute on the device while the rest of the C code executes on the host

Thread Hierarchy

Threads are grouped into blocks. Every block is independent of the others, which means blocks can be executed either in parallel or serially.
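
Below is a minimal sketch (my own example, not from the original post) of what this hierarchy looks like in a kernel. The hypothetical add_one kernel computes each thread's global index from its block index and its index within the block; the launch configuration <<<4, 256>>> creates 4 independent blocks of 256 threads each.

```cuda
// Each thread derives its global index from its block index (blockIdx),
// the block size (blockDim), and its index within the block (threadIdx).
__global__ void add_one(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)
        data[i] += 1.0f;                            // each thread handles one element
}

// Launch: 4 blocks of 256 threads each. Blocks are independent, so the
// hardware may run them in parallel or one after another, in any order.
// add_one<<<4, 256>>>(d_data, n);
```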

Memory Hierarchy

The CUDA programming model divides a program's memory into three levels: the local memory of each thread, the shared memory of each block, and the global memory of the grid.
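
The hypothetical kernel below sketches these three levels (assuming a block size of 256 threads; it is an illustration, not code from the post). A plain local variable is private to one thread, a __shared__ array is visible to every thread in the same block, and the pointer arguments refer to global memory that every thread in the grid (and the host) can access.

```cuda
__global__ void memory_levels(const float *global_in, float *global_out)
{
    // Local memory: private to each individual thread.
    float local_val = global_in[threadIdx.x];

    // Shared memory: one copy per block, visible to all threads in that block.
    __shared__ float block_buf[256];
    block_buf[threadIdx.x] = local_val;
    __syncthreads();                      // wait until the whole block has written

    // Global memory: visible to every thread in the grid and to the host.
    global_out[threadIdx.x] = block_buf[255 - threadIdx.x];
}
```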

Heterogeneous Programming

Heterogeneous programming in CUDA means that the kernels execute on a GPU while the rest of the C program executes on a CPU, and data can be transferred between the CPU and the GPU.
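
The following host-side sketch (again my own example, with a hypothetical scale kernel) shows that division of labor: the CPU allocates a buffer on the GPU with cudaMalloc, copies input data over with cudaMemcpy, launches the kernel, and copies the result back before continuing with ordinary serial C code.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void scale(float *x, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= factor;
}

int main(void)
{
    const int n = 1024;
    float h_x[n];                                   // host (CPU) buffer
    for (int i = 0; i < n; ++i) h_x[i] = (float)i;

    float *d_x;                                     // device (GPU) buffer
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(d_x, 2.0f, n);  // parallel part runs on the GPU

    cudaMemcpy(h_x, d_x, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);

    printf("h_x[10] = %f\n", h_x[10]);              // serial part runs on the CPU
    return 0;
}
```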

The CPU (the host) executes the serial code, while the GPU (the device) executes the parallel kernels.