The dim3 data type
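A minimal sketch of how dim3 behaves (the variable names are illustrative, not taken from the course demos): dim3 is CUDA's built-in type with three unsigned integer components x, y and z. Components that are not specified default to 1, so a 2-dimensional shape is simply a 3-dimensional shape with z = 1.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

int main()
{
    dim3 gridShape(3, 2);   // x = 3, y = 2, z defaults to 1
    dim3 lineShape(4);      // x = 4, y and z default to 1

    printf("gridShape = (%u,%u,%u)\n", gridShape.x, gridShape.y, gridShape.z);
    printf("lineShape = (%u,%u,%u)\n", lineShape.x, lineShape.y, lineShape.z);
    return 0;
}
```

This runs entirely on the host, so it can be used to inspect a shape before passing it to a kernel launch.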
An example of a 2-dimensional grid shape: a grid of 3×2 thread blocks (3 blocks along x, 2 blocks along y).
How to define a 3×2 grid shape in CUDA:
   dim3 gridShape = dim3( 3, 2 );
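The values of the identifying variables of each thread block in the 3×2 grid shape:
   gridDim.x = 3 and gridDim.y = 2 in every thread block
   blockIdx.x takes the values 0, 1, 2 and blockIdx.y takes the values 0, 1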
An example of a 2-dimensional (thread) block shape: a block of 2×3 threads (2 threads along x, 3 threads along y).
How to define a 2×3 thread block shape in CUDA:
   dim3 blockShape = dim3( 2, 3 );
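The values of the identifying variables of each thread in the 2×3 thread block shape:
   blockDim.x = 2 and blockDim.y = 3 in every thread
   threadIdx.x takes the values 0, 1 and threadIdx.y takes the values 0, 1, 2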
The values of the identifying variables of a (specific) thread in the 2×3 block inside a 3×2 grid: for example, the thread with threadIdx = (1,2) in the block with blockIdx = (2,1) sees gridDim = (3,2) and blockDim = (2,3).
DEMO program that shows the identifying variables of threads in a 2-dim grid and thread block:
#include <stdio.h>
#include <unistd.h>

__global__ void hello( )
{
   // Every thread prints its own block and thread coordinates
   printf("grid coord: (%d,%d), thread coord: (%d,%d), "
          "grid dim: (%d,%d), block dim: (%d,%d)\n",
          blockIdx.x, blockIdx.y, threadIdx.x, threadIdx.y,
          gridDim.x, gridDim.y, blockDim.x, blockDim.y);
}

int main()
{
   dim3 gridShape = dim3( 3, 2 );    // 3 x 2 thread blocks
   dim3 blockShape = dim3( 2, 3 );   // 2 x 3 threads per block

   hello<<< gridShape, blockShape>>>( );
   printf("I am the CPU: Hello World ! \n");
   cudaDeviceSynchronize();          // Wait for the GPU threads to finish
}
DEMO: /home/cs355001/demo/CUDA/1-intro/hello-2dim.cu
Review: how to compute a unique ID for the x-dimension
The expression blockIdx.x × blockDim.x + threadIdx.x computes a unique ID along the x-dimension in a 2-dimensional execution configuration. With blockDim.x = 2:
Row 1 (for all columns): 0 × 2 + 0 = 0
Row 2 (for all columns): 0 × 2 + 1 = 1
Row 3 (for all columns): 1 × 2 + 0 = 2
and so on, up to Row 6: 2 × 2 + 1 = 5
We can use a similar procedure to compute a unique ID for the y-dimension:
The expression blockIdx.y × blockDim.y + threadIdx.y computes a unique ID along the y-dimension in a 2-dimensional execution configuration. With blockDim.y = 3:
Column 1 (for all rows): 0 × 3 + 0 = 0
Column 2 (for all rows): 0 × 3 + 1 = 1
Column 3 (for all rows): 0 × 3 + 2 = 2
Column 4 (for all rows): 1 × 3 + 0 = 3
and so on, up to Column 6: 1 × 3 + 2 = 5
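The row and column arithmetic can be reproduced on the host without launching a kernel. The sketch below is an illustration only: uniqueID is a hypothetical helper name, and the loops stand in for the values CUDA would supply in blockIdx and threadIdx for a 3×2 grid of 2×3 blocks.

```cuda
#include <stdio.h>

// Hypothetical helper: block * dim + thread, the same formula as
// blockIdx.x * blockDim.x + threadIdx.x (or its y counterpart).
__host__ __device__ int uniqueID(int block, int dim, int thread)
{
    return block * dim + thread;
}

int main()
{
    dim3 gridShape(3, 2);    // 3 x 2 thread blocks
    dim3 blockShape(2, 3);   // 2 x 3 threads per block

    // x-dimension: 3 blocks of 2 threads each -> IDs 0..5
    for (unsigned b = 0; b < gridShape.x; b++)
        for (unsigned t = 0; t < blockShape.x; t++)
            printf("x: %u * %u + %u = %d\n",
                   b, blockShape.x, t, uniqueID(b, blockShape.x, t));

    // y-dimension: 2 blocks of 3 threads each -> IDs 0..5
    for (unsigned b = 0; b < gridShape.y; b++)
        for (unsigned t = 0; t < blockShape.y; t++)
            printf("y: %u * %u + %u = %d\n",
                   b, blockShape.y, t, uniqueID(b, blockShape.y, t));

    return 0;
}
```

Both dimensions yield the six IDs 0 through 5, because gridDim.x × blockDim.x = 3 × 2 and gridDim.y × blockDim.y = 2 × 3.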
DEMO program that shows how to compute unique ID for threads in a 2-dim grid and thread block:
#include <stdio.h>
#include <unistd.h>

__global__ void hello( )
{
   printf("blockIdx:(%d,%d), threadIdx:(%d,%d) -> Row,Col=(%d,%d)\n",
          blockIdx.x, blockIdx.y, threadIdx.x, threadIdx.y,
          blockIdx.x * blockDim.x + threadIdx.x,    // rowID
          blockIdx.y * blockDim.y + threadIdx.y);   // columnID
}

int main()
{
   dim3 blockShape = dim3( 2, 3 );   // 2 x 3 threads per block
   dim3 gridShape = dim3( 3, 2 );    // 3 x 2 thread blocks

   hello<<< gridShape, blockShape>>>( );
   printf("I am the CPU: Hello World ! \n");
   cudaDeviceSynchronize();          // Wait for the GPU threads to finish
}
DEMO: /home/cs355001/demo/CUDA/1-intro/hello-2dim-ID.cu
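One way to check that the (row, column) pairs really are unique is to let every thread increment a counter at its own position and then verify on the host that each counter ends at exactly 1. The following is a sketch under assumptions, not one of the course demos: the names mark, count and allHitOnce and the row-major layout are hypothetical choices made for illustration. It follows the text in taking the row from the x-dimension and the column from the y-dimension.

```cuda
#include <stdio.h>

// Each thread adds 1 to the cell addressed by its (row, col) pair.
__global__ void mark(int *count, int nCols)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;   // rowID
    int col = blockIdx.y * blockDim.y + threadIdx.y;   // columnID
    atomicAdd(&count[row * nCols + col], 1);
}

// Host-side check: returns 1 if every cell was hit exactly once.
int allHitOnce(const int *count, int n)
{
    for (int i = 0; i < n; i++)
        if (count[i] != 1) return 0;
    return 1;
}

int main()
{
    dim3 gridShape(3, 2);
    dim3 blockShape(2, 3);
    int nRows = gridShape.x * blockShape.x;   // 3 * 2 = 6
    int nCols = gridShape.y * blockShape.y;   // 2 * 3 = 6
    int n = nRows * nCols;                    // 36 threads, 36 cells

    int *d_count;
    cudaMalloc(&d_count, n * sizeof(int));
    cudaMemset(d_count, 0, n * sizeof(int));

    mark<<<gridShape, blockShape>>>(d_count, nCols);

    int h_count[36];
    // cudaMemcpy waits for the kernel to finish before copying
    cudaMemcpy(h_count, d_count, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_count);

    printf("%s\n", allHitOnce(h_count, n) ? "every (row,col) hit exactly once"
                                          : "collision or gap");
    return 0;
}
```

If two threads computed the same pair, some counter would end at 2 and another at 0, so the check would fail.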