GPU Teaching Kit: Accelerated Computing
Lecture 3.2 – CUDA Parallelism Model
Multidimensional Kernel Configuration
Objective
– To understand multidimensional grids
  – Multi-dimensional block and thread indices
  – Mapping block/thread indices to data indices
A Multi-Dimensional Grid Example
[Figure: the host launches Kernel 1 on the device as Grid 1, a two-dimensional arrangement of blocks labeled (0,0), (0,1), (1,0) and (1,1); a second grid, Grid 2, is also shown. Block (1,0) is expanded to show its threads, which carry three-dimensional indices such as (0,0,0) to (0,0,3), (0,1,0) to (0,1,3) and (1,0,0) to (1,0,3).]
Processing a Picture with a 2D Grid
[Figure: a 62 × 76 picture covered by a 2D grid of 16 × 16 blocks.]
Row-Major Layout in C/C++
[Figure: a 4 × 4 matrix M with elements M0,0 to M3,3 is linearized row by row into M0 to M15.]
The element at row 2, column 1 lands at linear index Row*Width+Col = 2*4+1 = 9.
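The Row*Width+Col rule above can be sketched in plain host C++. This is only an illustration: the helper names rowMajorIndex and rowColFromIndex and the RowCol struct are hypothetical and do not come from the slides.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: linear offset of element (row, col) in a row-major
// matrix that has `width` columns, i.e. Row*Width+Col.
inline std::size_t rowMajorIndex(std::size_t row, std::size_t col, std::size_t width) {
    assert(col < width);  // a column outside the row would alias the next row
    return row * width + col;
}

// Hypothetical helper: the inverse mapping, from a linear offset back to (row, col).
struct RowCol {
    std::size_t row;
    std::size_t col;
};

inline RowCol rowColFromIndex(std::size_t index, std::size_t width) {
    assert(width > 0);
    return RowCol{index / width, index % width};
}
```

For the 4 × 4 matrix on the slide, rowMajorIndex(2, 1, 4) evaluates to 9, matching M2,1 = M9.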
Source Code of a PictureKernel

    __global__ void PictureKernel(float* d_Pin, float* d_Pout,
                                  int height, int width)
    {
        // Calculate the row # of the d_Pin and d_Pout element
        int Row = blockIdx.y*blockDim.y + threadIdx.y;

        // Calculate the column # of the d_Pin and d_Pout element
        int Col = blockIdx.x*blockDim.x + threadIdx.x;

        // each thread computes one element of d_Pout if in range
        if ((Row < height) && (Col < width)) {
            d_Pout[Row*width+Col] = 2.0*d_Pin[Row*width+Col];
        }
    }

Scale every pixel value by 2.0
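One way to see how block and thread indices map onto pixels is to replay the launch sequentially on the host. The sketch below is an assumption-laden emulation, not CUDA code: Dim3 is a stand-in for CUDA's built-in index types, and pictureKernelBody, emulateLaunch and scalePictureOnHost are hypothetical names. It can also serve as a CPU reference when checking a kernel's output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for CUDA's built-in index/dimension types (assumption, host only).
struct Dim3 {
    unsigned x, y, z;
};

// Same index arithmetic and range check as PictureKernel, written as an
// ordinary function that receives the indices explicitly.
inline void pictureKernelBody(const float* pin, float* pout, int height, int width,
                              Dim3 blockIdx, Dim3 blockDim, Dim3 threadIdx) {
    int row = static_cast<int>(blockIdx.y * blockDim.y + threadIdx.y);
    int col = static_cast<int>(blockIdx.x * blockDim.x + threadIdx.x);
    if (row < height && col < width) {
        pout[row * width + col] = 2.0f * pin[row * width + col];
    }
}

// Visit every (block, thread) pair of a 2D launch one after another.
inline void emulateLaunch(Dim3 grid, Dim3 block, const float* pin, float* pout,
                          int height, int width) {
    for (unsigned by = 0; by < grid.y; ++by)
        for (unsigned bx = 0; bx < grid.x; ++bx)
            for (unsigned ty = 0; ty < block.y; ++ty)
                for (unsigned tx = 0; tx < block.x; ++tx)
                    pictureKernelBody(pin, pout, height, width,
                                      Dim3{bx, by, 0}, block, Dim3{tx, ty, 0});
}

// Scale a height x width picture using 16 x 16 blocks, as in the lecture.
inline std::vector<float> scalePictureOnHost(const std::vector<float>& pin,
                                             int height, int width) {
    assert(height > 0 && width > 0);
    assert(pin.size() == static_cast<std::size_t>(height) * static_cast<std::size_t>(width));
    Dim3 block{16, 16, 1};
    Dim3 grid{static_cast<unsigned>((width - 1) / 16 + 1),
              static_cast<unsigned>((height - 1) / 16 + 1), 1};
    std::vector<float> pout(pin.size(), -1.0f);  // -1 marks pixels never written
    emulateLaunch(grid, block, pin.data(), pout.data(), height, width);
    return pout;
}
```

Because the range check discards out-of-bounds (row, col) pairs, every pixel is written exactly once even though the grid launches more threads than there are pixels.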
Host Code for Launching PictureKernel

    // assume that the picture is m × n,
    // m pixels in y dimension and n pixels in x dimension
    // input d_Pin has been allocated on and copied to device
    // output d_Pout has been allocated on device
    …
    dim3 DimGrid((n-1)/16 + 1, (m-1)/16+1, 1);
    dim3 DimBlock(16, 16, 1);
    PictureKernel<<<DimGrid,DimBlock>>>(d_Pin, d_Pout, m, n);
    …
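The expressions (n-1)/16 + 1 and (m-1)/16+1 in the DimGrid line are integer ceiling divisions: they round the block count up so that a partial block still covers the leftover pixels. A minimal host-side sketch of that arithmetic follows; GridDims, blocksNeeded and gridFor are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical holder for the x and y block counts of a 2D grid.
struct GridDims {
    int x;
    int y;
};

// Number of blocks of size `blockSize` needed to cover `extent` pixels,
// using the same (extent-1)/blockSize + 1 form as the host code.
inline int blocksNeeded(int extent, int blockSize) {
    assert(extent > 0 && blockSize > 0);
    return (extent - 1) / blockSize + 1;
}

// m pixels in the y dimension, n pixels in the x dimension, as in the slide.
inline GridDims gridFor(int m, int n, int blockSize) {
    return GridDims{blocksNeeded(n, blockSize), blocksNeeded(m, blockSize)};
}
```

For the 62 × 76 picture (m = 62, n = 76) with 16 × 16 blocks, this gives 5 blocks in x and 4 blocks in y, or 20 blocks in total.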
Covering a 62 × 76 Picture with 16 × 16 Blocks
Not all threads in a Block will follow the same control flow path.
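To make the divergence concrete, the sketch below counts how many threads in each block pass the range check. It is a host-side illustration under the lecture's setup (square blocks, one thread per pixel); CoverageStats, activeAlong and coverage are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of how a grid of square blocks covers a picture.
struct CoverageStats {
    int totalBlocks;      // blocks in the grid
    int fullBlocks;       // blocks in which every thread is in range
    int launchedThreads;  // totalBlocks * blockSize * blockSize
    int activeThreads;    // threads that pass the (Row < height && Col < width) test
};

// In-range threads of block `blockIndex` along one dimension of length `extent`.
inline int activeAlong(int blockIndex, int blockSize, int extent) {
    int start = blockIndex * blockSize;
    return std::max(0, std::min(blockSize, extent - start));
}

inline CoverageStats coverage(int height, int width, int blockSize) {
    assert(height > 0 && width > 0 && blockSize > 0);
    int gridX = (width - 1) / blockSize + 1;
    int gridY = (height - 1) / blockSize + 1;
    CoverageStats s{gridX * gridY, 0, gridX * gridY * blockSize * blockSize, 0};
    for (int by = 0; by < gridY; ++by) {
        for (int bx = 0; bx < gridX; ++bx) {
            int active = activeAlong(by, blockSize, height) *
                         activeAlong(bx, blockSize, width);
            s.activeThreads += active;
            if (active == blockSize * blockSize) ++s.fullBlocks;
        }
    }
    return s;
}
```

For a picture 62 pixels high and 76 pixels wide with 16 × 16 blocks, coverage(62, 76, 16) reports 20 blocks, of which 12 are fully in range. The remaining 8 blocks along the right and bottom edges mix in-range and out-of-range threads, so 4712 of the 5120 launched threads do useful work and 408 skip the assignment.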
The GPU Teaching Kit is licensed by NVIDIA and the University of Illinois under the Creative Commons Attribution-NonCommercial 4.0 International License.