CUDA incompatible with my gcc version

As already pointed out, nvcc depends on gcc 4.4. You can configure nvcc to use the correct version of gcc without passing any compiler parameters by adding softlinks to the bin directory created by the nvcc install. The default CUDA binary directory is /usr/local/cuda/bin; adding a softlink to the correct …

Difference between global and device functions

Global functions (declared __global__) are also called “kernels”. They are the functions that you can call from the host side using CUDA kernel call semantics (<<<…>>>). Device functions (declared __device__) can only be called from other device or global functions; they cannot be called from host code.
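As a minimal sketch of the distinction, the example below pairs one __device__ function with one __global__ kernel that calls it. The names square, squareKernel and squareOnGpu are hypothetical, chosen only for illustration, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// __device__: callable only from device code (other __device__ or __global__ functions).
__device__ float square(float x) {
    return x * x;
}

// __global__: a kernel, launched from host code with <<<blocks, threads>>>.
__global__ void squareKernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = square(in[i]);  // device function called from a kernel
    }
}

// Host helper (hypothetical): copies input to the GPU, launches the kernel,
// and copies the result back. Error checking is omitted for brevity.
std::vector<float> squareOnGpu(const std::vector<float>& input) {
    int n = static_cast<int>(input.size());
    std::vector<float> output(n);
    if (n == 0) return output;

    float* dIn = nullptr;
    float* dOut = nullptr;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, input.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up so every element is covered
    squareKernel<<<blocks, threads>>>(dIn, dOut, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(output.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return output;
}
```

Calling square directly from squareOnGpu would be rejected by nvcc, because squareOnGpu is host code; only squareKernel can be launched from the host.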

Using Java with Nvidia GPUs (CUDA)

First of all, you should be aware that CUDA will not automagically make computations faster. On the one hand, GPU programming is an art, and it can be very, very challenging to get it right. On the other hand, GPUs are well suited only to certain kinds of computations. This may …

How do CUDA blocks/warps/threads map onto CUDA cores?

Two of the best references are the NVIDIA Fermi Compute Architecture Whitepaper and the GF104 reviews. I’ll try to answer each of your questions. The programmer divides work into threads, threads into thread blocks, and thread blocks into grids. The compute work distributor allocates thread blocks to Streaming Multiprocessors (SMs). Once a thread block is distributed to a …

Understanding CUDA grid dimensions, block dimensions and threads organization (simple explanation)

Hardware: If a GPU device has, for example, 4 multiprocessing units that can each run 768 threads, then at any given moment no more than 4*768 threads will really be running in parallel (if you planned more threads, they will wait their turn). Software: Threads are organized in blocks. A block is executed …
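The 4*768 bound can be checked against a real device. The sketch below assumes the CUDA runtime API is available; the function names are hypothetical. It multiplies the multiprocessor count by the per-multiprocessor thread limit reported by cudaGetDeviceProperties, which gives the number of threads that can be resident at once rather than a guarantee that all of them execute in the same cycle.

```cuda
#include <cuda_runtime.h>

// Pure arithmetic from the example: multiprocessor count times threads per multiprocessor.
int maxResidentThreadsFor(int smCount, int threadsPerSm) {
    return smCount * threadsPerSm;
}

// Queries the limits of an actual device; returns -1 if the query fails
// (for example, when no CUDA device is present).
int maxResidentThreads(int device) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        return -1;
    }
    return maxResidentThreadsFor(prop.multiProcessorCount,
                                 prop.maxThreadsPerMultiProcessor);
}
```

For the figures in the text, maxResidentThreadsFor(4, 768) gives 3072.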

Using GPU from a docker container?

Regan’s answer is great, but it’s a bit out of date: the correct way to do this is to avoid the lxc execution context, since Docker dropped LXC as the default execution context as of Docker 0.9. Instead, it’s better to tell Docker about the NVIDIA devices via the --device flag, and just use …

Different CUDA versions shown by nvcc and NVIDIA-smi

CUDA has two primary APIs: the runtime API and the driver API. Both have a corresponding version (e.g. 8.0, 9.0, etc.). The necessary support for the driver API (e.g. libcuda.so on Linux) is installed by the GPU driver installer. The necessary support for the runtime API (e.g. libcudart.so on Linux, and also nvcc) is installed by …
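Both versions can also be read from inside a program. The sketch below uses the runtime API calls cudaDriverGetVersion and cudaRuntimeGetVersion; the helper names formatCudaVersion and printCudaVersions are hypothetical. The two calls return integers encoded as 1000 * major + 10 * minor, so the helper converts them to the familiar dotted form.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <string>

// CUDA encodes versions as 1000 * major + 10 * minor (e.g. 9000 means 9.0).
std::string formatCudaVersion(int v) {
    return std::to_string(v / 1000) + "." + std::to_string((v % 1000) / 10);
}

// Prints the driver-side and runtime-side CUDA versions.
void printCudaVersions() {
    int driver = 0;
    int runtime = 0;
    // Latest CUDA version supported by the installed GPU driver (0 if no driver).
    cudaDriverGetVersion(&driver);
    // Version of the CUDA runtime library this program was built against.
    cudaRuntimeGetVersion(&runtime);
    std::printf("driver API:  %s\n", formatCudaVersion(driver).c_str());
    std::printf("runtime API: %s\n", formatCudaVersion(runtime).c_str());
}
```

The two numbers are allowed to differ: a program generally runs as long as the driver version is at least as new as the runtime version.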

Which TensorFlow and CUDA version combinations are compatible?

TL;DR: See this table: https://www.tensorflow.org/install/source#gpu Generally: check the CUDA version with cat /usr/local/cuda/version.txt and the cuDNN version with grep CUDNN_MAJOR -A 2 /usr/local/cuda/include/cudnn.h, then install a combination as given below in the images or here. The following images and the link provide an overview of the officially supported/tested combinations of CUDA and TensorFlow on Linux, macOS and Windows: …
