Embedded Linux using yocto project

Hello all. In my previous post, an introduction to GPU programming using the NVIDIA CUDA Toolkit (link in the comment section), I explained how to write a simple CUDA program that adds two arrays. In this post, let us understand how a CUDA kernel is launched with the given block-dimension and grid-dimension parameters.

A GPU follows a single instruction, multiple threads (SIMT) architecture, which means that multiple threads are issued to process the same instruction. These threads are organized into blocks, and blocks are organized into a grid.

Let us consider the CUDA Hello World program, where we launch the kernel with one thread per block and one such block in the grid.

HelloWorld.cu

//Pre-processor directives
#include <stdio.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

//Device code
__global__ void cuda_kernel()
{
    printf("Hello World!");
}

//Hos...
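The thread, block, and grid organization described above can be sketched with a slightly larger launch. This is a minimal illustration, not the code from the post: the kernel name hello_kernel, the helper global_thread_id, and the choice of 2 blocks with 4 threads each are all assumptions made for the example. The launch from the post, with one block of one thread, would correspond to grid(1) and block(1).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper (not from the original post): flattens a 1D
// block index and thread index into a single global thread index.
__host__ __device__ inline int global_thread_id(int block_idx, int block_dim, int thread_idx)
{
    return block_idx * block_dim + thread_idx;
}

// Device code: every launched thread executes this function once
// and prints its own coordinates within the grid.
__global__ void hello_kernel()
{
    int b = static_cast<int>(blockIdx.x);
    int t = static_cast<int>(threadIdx.x);
    int gid = global_thread_id(b, static_cast<int>(blockDim.x), t);
    printf("Hello World! block %d, thread %d, global id %d\n", b, t, gid);
}

// Host code: chooses the launch configuration and starts the kernel.
int main()
{
    dim3 grid(2);   // assumed for illustration: 2 blocks in the grid
    dim3 block(4);  // assumed for illustration: 4 threads per block

    // <<<grid, block>>> is the execution configuration: 2 * 4 = 8 threads in total.
    hello_kernel<<<grid, block>>>();

    // Catch an invalid launch configuration.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "Launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Wait for the kernel to finish; this also flushes device-side printf output.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Compiled with nvcc, this launch should produce eight lines, one per thread. The order in which different blocks print is not guaranteed, so the output should not be relied on to appear sorted by global id.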

CUDA Kernel Launch