What does #pragma unroll do exactly? Does it affect the number of threads?
No. It means you have launched a CUDA kernel with one block, and that block has 100 active threads. You’re passing size as the second function parameter to your kernel. In your kernel, each of those 100 threads executes the for loop 100 times. #pragma unroll is a compiler optimization hint that can, for example, …
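To make the distinction concrete, here is a minimal sketch of the situation described, not the asker's actual code. The kernel name sumLoop, the helper runSumLoop, the loop body, and the <<<1, 100>>> launch are all assumptions made for illustration. The thread count comes only from the launch configuration; the pragma sits on the loop and only influences the instructions the compiler generates for it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread sums 0..size-1 into its own output slot.
// `size` is the second kernel parameter, as in the answer above.
__global__ void sumLoop(int *out, int size)
{
    int acc = 0;
    // Hint asking the compiler to unroll the loop that follows. It does not
    // change how many threads run or how many iterations each thread performs.
    #pragma unroll
    for (int i = 0; i < size; ++i) {
        acc += i;
    }
    out[threadIdx.x] = acc;
}

// Hypothetical host helper: launches one block of `threads` threads and
// copies the per-thread results into h_out. Returns 0 (cudaSuccess) on success.
int runSumLoop(int threads, int size, int *h_out)
{
    int *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, threads * sizeof(int));
    if (err != cudaSuccess) return static_cast<int>(err);

    // The launch configuration, not the pragma, decides the thread count:
    // 1 block with `threads` threads.
    sumLoop<<<1, threads>>>(d_out, size);
    err = cudaGetLastError();
    if (err == cudaSuccess) {
        // cudaMemcpy on the default stream waits for the kernel to finish.
        err = cudaMemcpy(h_out, d_out, threads * sizeof(int),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);
    return static_cast<int>(err);
}

int main()
{
    const int threads = 100;  // assumed launch: 1 block, 100 threads
    const int size = 100;     // each thread runs the loop 100 times
    int h_out[threads];

    int err = runSumLoop(threads, size, h_out);
    if (err != 0) {
        fprintf(stderr, "CUDA error: %s\n",
                cudaGetErrorString(static_cast<cudaError_t>(err)));
        return 1;
    }
    printf("thread 0: %d, thread %d: %d\n",
           h_out[0], threads - 1, h_out[threads - 1]);
    return 0;
}
```

Removing the #pragma unroll line should leave the results unchanged, since every thread still performs the same 100 iterations; only the generated machine code may differ.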