Error compiling CUDA from Command Prompt

You will need to add the folder containing the “cl.exe” file to your PATH environment variable. For example: C:\Program Files\Microsoft Visual Studio 10.0\VC\bin. Edit: OK, go to My Computer -> Properties -> Advanced System Settings -> Environment Variables. There, look for “PATH” in the list and add the path above (or whatever is the location … Read more
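To sanity-check the setup, here is a minimal CUDA source file that can be built from the Command Prompt once cl.exe is on PATH; the file name and launch configuration are illustrative and not part of the original answer.

    // hello.cu -- minimal smoke test for command-line compilation.
    // After adding the Visual C++ bin folder to PATH, compile with:
    //   nvcc hello.cu -o hello.exe
    #include <cstdio>

    __global__ void hello_kernel()
    {
        // Device-side printf requires compute capability 2.0 or newer.
        printf("Hello from thread %d of block %d\n", threadIdx.x, blockIdx.x);
    }

    int main()
    {
        hello_kernel<<<2, 4>>>();   // 2 blocks of 4 threads, just as a test
        cudaDeviceSynchronize();    // wait so the kernel's output is flushed
        return 0;
    }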

CUDA determining threads per block, blocks per grid

In general, you want to size your blocks/grid to match your data and simultaneously maximize occupancy, that is, how many threads are active at one time. The major factors influencing occupancy are shared memory usage, register usage, and thread block size. A CUDA-enabled GPU has its processing capability split up into SMs (streaming multiprocessors), … Read more
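As a concrete sketch (not taken from the excerpt above), a common pattern is to fix a block size and derive the grid size from the data size by rounding up; the block size of 256 here is an assumption to be tuned per kernel and architecture.

    // Launch configuration for an N-element vector operation.
    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                  // guard against the padded last block
            data[i] *= factor;
    }

    void launch_scale(float *d_data, float factor, int n)
    {
        int threadsPerBlock = 256;  // assumed block size, tune as needed
        int blocksPerGrid = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up
        scale<<<blocksPerGrid, threadsPerBlock>>>(d_data, factor, n);
    }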

GPU Programming, CUDA or OpenCL? [closed]

If you use OpenCL, you can easily use it on both Windows and Linux, because the display drivers are enough to run OpenCL programs; for development you simply need to install the SDK. CUDA has more requirements, such as specific GCC versions, but it is not much more difficult to install on Linux … Read more

When to call cudaDeviceSynchronize?

Although CUDA kernel launches are asynchronous, all GPU-related tasks placed in one stream (which is the default behavior) are executed sequentially. So, for example:

    kernel1<<<X,Y>>>(…); // kernel starts execution, CPU continues to next statement
    kernel2<<<X,Y>>>(…); // kernel is placed in queue and will start after kernel1 finishes, CPU continues to next statement
    cudaMemcpy(…);       // CPU … Read more
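A self-contained sketch of that ordering, with illustrative names and sizes: the two kernels queue on the default stream and run in order, and the following cudaMemcpy waits for them, so no explicit cudaDeviceSynchronize is needed before reading the result on the host.

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void kernel1(float *x) { x[threadIdx.x] += 1.0f; }
    __global__ void kernel2(float *x) { x[threadIdx.x] *= 2.0f; }

    int main()
    {
        const int N = 256;
        float h[N] = {0}, *d;
        cudaMalloc(&d, N * sizeof(float));
        cudaMemcpy(d, h, N * sizeof(float), cudaMemcpyHostToDevice);

        kernel1<<<1, N>>>(d);   // asynchronous: control returns to the CPU immediately
        kernel2<<<1, N>>>(d);   // queued in the default stream, runs after kernel1

        // cudaMemcpy on the default stream waits for the queued kernels,
        // so an explicit cudaDeviceSynchronize is unnecessary here.
        cudaMemcpy(h, d, N * sizeof(float), cudaMemcpyDeviceToHost);
        printf("h[0] = %f\n", h[0]);   // prints 2.000000

        cudaFree(d);
        return 0;
    }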

How can I flush GPU memory using CUDA (physical reset is unavailable)

Check what is using your GPU memory with

    sudo fuser -v /dev/nvidia*

Your output will look something like this:

                     USER        PID    ACCESS  COMMAND
    /dev/nvidia0:    root        1256   F...m   Xorg
                     username    2057   F...m   compiz
                     username    2759   F...m   chrome
                     username    2777   F...m   chrome
                     username    20450  F...m   python
                     username    20699  F...m   python

Then kill the PID that you no … Read more
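The excerpt works from the shell; as a complementary (not equivalent) check from inside a CUDA program, and not part of the quoted answer, cudaMemGetInfo reports how much device memory is currently free, which can confirm whether killing the offending process actually released it.

    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        size_t freeBytes = 0, totalBytes = 0;
        // Query the current device's free and total memory in bytes.
        cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
        if (err != cudaSuccess) {
            fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        printf("free: %.1f MiB / total: %.1f MiB\n",
               freeBytes / (1024.0 * 1024.0), totalBytes / (1024.0 * 1024.0));
        return 0;
    }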

In CUDA, what is memory coalescing, and how is it achieved?

It’s likely that this information applies only to compute capability 1.x, or CUDA 2.0. More recent architectures and CUDA 3.0 have more sophisticated global memory access and in fact “coalesced global loads” are not even profiled for these chips. Also, this logic can be applied to shared memory to avoid bank conflicts. A coalesced memory … Read more
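A hedged illustration of the access patterns being described (not from the original answer): consecutive threads of a warp reading consecutive elements coalesce into few memory transactions, while a strided pattern scatters a warp's loads across many segments.

    // Coalesced: thread i reads element i, so a warp touches one contiguous
    // segment of memory and its loads combine into few transactions.
    __global__ void copy_coalesced(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i];
    }

    // Strided: consecutive threads read addresses `stride` elements apart,
    // so a single warp's loads span many separate memory segments.
    __global__ void copy_strided(const float *in, float *out, int n, int stride)
    {
        int i = (blockIdx.x * blockDim.x + threadIdx.x) * stride;
        if (i < n)
            out[i] = in[i];
    }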

Streaming multiprocessors, Blocks and Threads (CUDA)

The thread / block layout is described in detail in the CUDA programming guide. In particular, chapter 4 states: The CUDA architecture is built around a scalable array of multithreaded Streaming Multiprocessors (SMs). When a CUDA program on the host CPU invokes a kernel grid, the blocks of the grid are enumerated and distributed to … Read more
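To make the SM/block relationship concrete, the CUDA runtime can report a device's SM count and per-block limits; the following query is an illustrative sketch, not part of the quoted guide.

    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);   // properties of device 0

        printf("Device: %s\n", prop.name);
        printf("Streaming multiprocessors (SMs): %d\n", prop.multiProcessorCount);
        printf("Max threads per block:           %d\n", prop.maxThreadsPerBlock);
        printf("Max threads per SM:              %d\n", prop.maxThreadsPerMultiProcessor);
        return 0;
    }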

NVIDIA vs AMD: GPGPU performance

Metaphorically speaking, ATI has a good engine compared to NVIDIA, but NVIDIA has a better car 😀 This is mostly because NVIDIA has invested a good amount of its resources (in money and people) to develop important libraries required for scientific computing (BLAS, FFT), and then done a good job again in promoting them. This may be … Read more
