opencl – Tarik Billa

OpenCL: work group concept

April 9, 2024 by Tarik

Part of the confusion here, I think, comes down to terminology. What GPU people often call cores aren’t really cores, and what GPU people often call threads are threads only in a certain sense. Cores: A core, in GPU marketing terms, may refer to something like a CPU core, or it may refer to a single lane … Read more

How to use OpenCL on Android?

December 26, 2023 by Tarik

OpenCL vs OpenMP performance [closed]

December 16, 2023 by Tarik

The benchmarks I’ve seen indicate that OpenCL and OpenMP running on the same hardware are usually comparable in performance, or that OpenMP performs slightly better. I haven’t seen any benchmarks that I would consider conclusive, though, because most of them lacked detailed explanations of their methodology. That said, there are a few useful things to … Read more

Can I program Nvidia’s CUDA using only Python or do I have to learn C?

September 11, 2023 by Tarik

You should take a look at CUDAmat and Theano. Both are approaches to writing code that executes on the GPU without really having to know much about GPU programming.

OpenCL, Vulkan, Sycl

August 25, 2023 by Tarik

How does OpenCL relate to Vulkan? I understand that OpenCL is higher level and abstracts the devices, but does (or could) it use Vulkan internally? They’re not related to each other at all. Well, they do technically use the same intermediate shader language, but Vulkan forbids the Kernel execution model, and … Read more

Causes for CL_INVALID_WORK_GROUP_SIZE

August 18, 2023 by Tarik

CL_DEVICE_MAX_WORK_GROUP_SIZE should return a single size_t value (for example 512, but I don’t know what it’d be on your system). This is the maximum number of work-items in a work-group, not the maximum in each dimension. So in your case you are trying to make a 2D work-group with 32*32 = 1024 work-items, and presumably … Read more
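
The total-versus-per-dimension point above can be checked before a kernel is ever enqueued. Below is a minimal sketch in plain C, with no OpenCL headers required. The helper name local_size_fits is hypothetical, and the limit is passed in as an ordinary argument; in a real program it would presumably come from clGetDeviceInfo with CL_DEVICE_MAX_WORK_GROUP_SIZE. The sketch only covers the total-count rule described in the excerpt, not every cause of CL_INVALID_WORK_GROUP_SIZE.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the product of the local work sizes
 * does not exceed max_work_group_size, and 0 otherwise.
 * The limit applies to the total number of work-items in a work-group,
 * not to each dimension separately.
 */
int local_size_fits(const size_t *local, unsigned dims,
                    size_t max_work_group_size)
{
    size_t total = 1;
    for (unsigned i = 0; i < dims; ++i) {
        if (local[i] == 0) {
            return 0; /* a zero-sized dimension is never valid */
        }
        /* Same test as total * local[i] > max, written to avoid overflow. */
        if (local[i] > max_work_group_size / total) {
            return 0;
        }
        total *= local[i];
    }
    return 1;
}
```

With a limit of 512, a 32 by 32 local size (1024 work-items) is rejected, while 16 by 16 (256 work-items) or 32 by 16 (exactly 512) passes.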

library is linked but reference is undefined

August 14, 2023 by Tarik

When you are linking, the order of your libraries and source files makes a difference. For example, in your case, with g++ -I/usr/local/cuda/include -L/usr/lib/nvidia-current -lOpenCL opencl.cpp functions defined in the OpenCL library might not be loaded, since there is nothing before them asking for a look-up. However, if you use g++ opencl.cpp -I/usr/local/cuda/include -L/usr/lib/nvidia-current -lOpenCL then any … Read more

How to get a “random” number in OpenCL

August 9, 2023 by Tarik

I was solving this “no random” issue for the last few days and came up with three different approaches: Xorshift – I created a generator based on this one. All you have to do is provide one uint2 number (seed) for the whole kernel, and every work-item will compute its own random number // ‘randoms’ is … Read more
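
The excerpt is cut off before the generator itself, so the following is not the author's code. It is a minimal sketch in plain C of the general idea, assuming the standard 32-bit xorshift step with shifts 13, 17 and 5. The function seed_for_item and its mixing constants are hypothetical, chosen only to show how a two-word seed and a work-item index could be combined into a non-zero per-item state.

```c
#include <stdint.h>

/* One xorshift32 step (shifts 13, 17, 5). The state must never be zero,
 * because zero maps to zero and the generator would get stuck. */
uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Hypothetical seeding: mix a two-word seed (standing in for a uint2)
 * with the work-item index so that each item starts from its own state.
 * The multipliers are arbitrary odd constants, an assumption of this sketch. */
uint32_t seed_for_item(uint32_t seed_x, uint32_t seed_y, uint32_t global_id)
{
    uint32_t s = seed_x ^ (seed_y * 2654435761u)
                        ^ (global_id * 2246822519u + 1u);
    return s != 0u ? s : 1u; /* avoid the all-zero state */
}
```

In an OpenCL kernel the same two functions could be written almost unchanged with uint in place of uint32_t, with get_global_id(0) supplying the index. Xorshift of this kind is fast but not suitable for cryptographic use.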

Debugger for OpenCL [closed]

July 25, 2023 by Tarik

You may also want to look into CodeXL: https://gpuopen.com/compute-product/codexl/ CodeXL was originally developed by AMD, but was later released as an open-source project.

Using Keras & Tensorflow with AMD GPU

June 28, 2023 by Tarik

I’m writing an OpenCL 1.2 backend for Tensorflow at https://github.com/hughperkins/tensorflow-cl. This fork of Tensorflow for OpenCL has the following characteristics: it targets any/all OpenCL 1.2 devices. It doesn’t need OpenCL 2.0, doesn’t need SPIR-V or SPIR, and doesn’t need Shared Virtual Memory. And so on … it’s based on an underlying library called ‘cuda-on-cl’, https://github.com/hughperkins/cuda-on-cl cuda-on-cl … Read more