cuda – Page 10 – Tarik Billa

CUDA incompatible with my gcc version

December 9, 2022 by Tarik

As already pointed out, nvcc depends on gcc 4.4 here (each CUDA release supports only a specific range of gcc versions). You can configure nvcc to use the correct gcc version without passing any compiler parameters by adding symbolic links (softlinks) to the bin directory created by the CUDA install. The default CUDA binary directory is /usr/local/cuda/bin; adding a softlink to the correct … Read more

Difference between global and device functions

November 29, 2022 by Tarik

Global functions (declared __global__) are also called “kernels”. They are the functions that you can launch from the host side using CUDA kernel launch syntax (<<<…>>>). Device functions (declared __device__) can only be called from other device or global functions; they cannot be called from host code.

Using Java with Nvidia GPUs (CUDA)

November 14, 2022 by Tarik

First of all, be aware that CUDA will not automagically make computations faster. One reason is that GPU programming is an art, and it can be very, very challenging to get it right. Another is that GPUs are well-suited only for certain kinds of computations. This may … Read more

How do CUDA blocks/warps/threads map onto CUDA cores?

November 11, 2022 by Tarik

Two of the best references are the NVIDIA Fermi Compute Architecture Whitepaper and GF104 reviews. I’ll try to answer each of your questions. The programmer divides work into threads, threads into thread blocks, and thread blocks into grids. The compute work distributor allocates thread blocks to Streaming Multiprocessors (SMs). Once a thread block is distributed to a … Read more

Understanding CUDA grid dimensions, block dimensions and threads organization (simple explanation)

October 26, 2022 by Tarik

Hardware: If a GPU device has, for example, 4 multiprocessing units, and each can run 768 threads, then at any given moment no more than 4*768 threads will really be running in parallel (if you planned more threads, they will wait their turn). Software: Threads are organized in blocks. A block is executed … Read more
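The arithmetic in the hardware example above can be sketched in a few lines of Python. This is a simplified model, assuming the only limit is a fixed number of resident threads per multiprocessor; real GPUs also limit residency by registers, shared memory, and blocks per multiprocessor. The function names are hypothetical and chosen only for illustration.

```python
import math


def max_resident_threads(num_sms: int, threads_per_sm: int) -> int:
    """Upper bound on threads resident on the device at one moment."""
    return num_sms * threads_per_sm


def waves_needed(total_threads: int, num_sms: int, threads_per_sm: int) -> int:
    """Rounds needed when a launch exceeds the resident-thread limit.

    Simplifying assumption: threads are scheduled in full batches.
    """
    limit = max_resident_threads(num_sms, threads_per_sm)
    return math.ceil(total_threads / limit)


# The example from the text: 4 multiprocessors, 768 threads each.
print(max_resident_threads(4, 768))   # 3072
print(waves_needed(10_000, 4, 768))   # 4
```

With 10,000 planned threads, the extra threads beyond 3072 wait their turn, which is what the second call counts.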

A top-like utility for monitoring CUDA activity on a GPU

October 15, 2022 by Tarik

To get real-time insight into used resources, run:

nvidia-smi -l 1

This loops and refreshes the view every second. If you do not want to keep past traces of the looped call in the console history, you can instead run:

watch -n0.1 nvidia-smi

where 0.1 is the time interval, in seconds.
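If you would rather collect the numbers from a script than watch them in a terminal, a minimal Python sketch follows. It assumes nvidia-smi's --query-gpu and --format=csv,noheader,nounits options and that every queried field is numeric; some GPUs report placeholder text such as [N/A] for certain fields, which this parser does not handle. The function names and the sample line are hypothetical.

```python
import shutil
import subprocess

QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line produced with --format=csv,noheader,nounits."""
    util, used, total = (field.strip() for field in line.split(","))
    return {
        "util_pct": int(util),
        "mem_used_mib": int(used),
        "mem_total_mib": int(total),
    }


def sample_gpus() -> list:
    """Return one dict per GPU, or an empty list if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


# Hypothetical sample line, so the parser can be tried without a GPU.
print(parse_gpu_line("35, 1024, 8192"))
```

Calling sample_gpus() in a loop with time.sleep(1) between calls would mimic the one-second refresh of nvidia-smi -l 1.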

Using GPU from a docker container?

October 15, 2022 by Tarik

Regan’s answer is great, but it’s a bit out of date, since the correct way to do this is to avoid the lxc execution context, as Docker has dropped LXC as the default execution context as of docker 0.9. Instead it’s better to tell docker about the nvidia devices via the --device flag, and just use … Read more

How to verify CuDNN installation?

October 10, 2022 by Tarik

Installing cuDNN is just a matter of copying some files. Hence, to check whether cuDNN is installed (and which version you have), you only need to check those files. Install cuDNN. Step 1: Register an NVIDIA developer account and download cuDNN here (about 80 MB). You might need nvcc --version to get your CUDA version. Step … Read more

Different CUDA versions shown by nvcc and nvidia-smi

October 10, 2022 by Tarik

CUDA has two primary APIs, the runtime API and the driver API. Each has a corresponding version (e.g. 8.0, 9.0, etc.). The necessary support for the driver API (e.g. libcuda.so on Linux) is installed by the GPU driver installer. The necessary support for the runtime API (e.g. libcudart.so on Linux, and also nvcc) is installed by … Read more
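Because the two versions come from separate installers, they can differ. A minimal Python sketch of comparing them is shown below. It assumes the general rule that the CUDA version supported by the driver should be at least the runtime version, and it ignores special cases such as forward-compatibility packages. The function names are hypothetical.

```python
def parse_version(text: str) -> tuple:
    """Turn a version string such as '9.0' into a comparable tuple (9, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def runtime_supported(driver_cuda: str, runtime_cuda: str) -> bool:
    """True when the driver's CUDA version is at least the runtime's.

    Assumption: plain 'major.minor' strings, compared numerically.
    """
    return parse_version(driver_cuda) >= parse_version(runtime_cuda)


print(runtime_supported("9.0", "8.0"))  # True
print(runtime_supported("8.0", "9.0"))  # False
```

Comparing tuples rather than strings matters once versions reach two digits: as strings, "10.1" sorts before "9.2".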

Which TensorFlow and CUDA version combinations are compatible?

October 2, 2022 by Tarik

TL;DR: See this table: https://www.tensorflow.org/install/source#gpu

Generally: Check the CUDA version:

cat /usr/local/cuda/version.txt

and the cuDNN version:

grep CUDNN_MAJOR -A 2 /usr/local/cuda/include/cudnn.h

and install a combination as given below in the images or here. The following images and the link provide an overview of the officially supported/tested combinations of CUDA and TensorFlow on Linux, macOS and Windows: … Read more
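The grep command above prints the CUDNN_MAJOR define and the two lines after it. As a rough sketch, the same information can be pulled out in Python. This assumes the header contains plain `#define CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` lines, as the grep implies; newer cuDNN releases keep these defines in a separate cudnn_version.h instead. The sample header text and the function name are hypothetical, and the version numbers in the sample are made up.

```python
import re

# Hypothetical excerpt in the style of an older cudnn.h; values are made up.
SAMPLE_HEADER = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""

_DEFINE = re.compile(
    r"^\s*#define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", re.MULTILINE
)


def parse_cudnn_version(header_text: str) -> tuple:
    """Return (major, minor, patch); raises KeyError if a define is missing."""
    found = {name: int(value) for name, value in _DEFINE.findall(header_text)}
    return (found["MAJOR"], found["MINOR"], found["PATCHLEVEL"])


print(parse_cudnn_version(SAMPLE_HEADER))  # (7, 6, 5)
```

To check a real installation, pass the contents of /usr/local/cuda/include/cudnn.h to parse_cudnn_version and look the resulting version up in the TensorFlow table.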