OpenCL vs OpenMP performance [closed]

The benchmarks I’ve seen indicate that OpenCL and OpenMP running on the same hardware are usually comparable in performance, or that OpenMP performs slightly better. I haven’t seen any benchmarks that I would consider conclusive, though, because most of them lack detailed explanations of their methodology. However, there are a few useful things to … Read more

OpenCL, Vulkan, Sycl

How does OpenCL relate to Vulkan? I understand that OpenCL is higher level and abstracts the devices, but does (or could) it use Vulkan internally? They’re not related to each other at all. Well, they do technically use the same intermediate shader language, but Vulkan forbids the Kernel execution model, and … Read more

Causes for CL_INVALID_WORK_GROUP_SIZE

CL_DEVICE_MAX_WORK_GROUP_SIZE should return a single size_t value (for example 512, but I don’t know what it’d be on your system). This is the maximum number of work-items in a work-group, not the maximum in each dimension. So in your case you are trying to make a 2D work-group with 32*32 = 1024 work-items, and presumably … Read more
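
The check described above can be sketched in plain C without touching the OpenCL runtime. This is only an illustration: the function name `local_size_fits` and its parameters are hypothetical, and in a real program the limit would come from querying `CL_DEVICE_MAX_WORK_GROUP_SIZE` with `clGetDeviceInfo` rather than being passed in by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of OpenCL): returns true when the total
 * number of work-items in a work-group, i.e. the product of all local
 * dimensions, does not exceed max_work_group_size. */
bool local_size_fits(const size_t *local_size, unsigned dims,
                     size_t max_work_group_size)
{
    size_t total = 1;
    for (unsigned i = 0; i < dims; ++i) {
        if (local_size[i] == 0) {
            return false; /* a zero-sized dimension is never valid */
        }
        /* Compare by division so the running product cannot overflow. */
        if (total > max_work_group_size / local_size[i]) {
            return false;
        }
        total *= local_size[i];
    }
    return true;
}
```

With an assumed limit of 512, a 32 x 32 local size is rejected because 1024 work-items exceed the limit, while 16 x 16 (256 work-items) is accepted.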

library is linked but reference is undefined

When you are linking, the order of your libraries and source files makes a difference. For example, in your case, with g++ -I/usr/local/cuda/include -L/usr/lib/nvidia-current -lOpenCL opencl.cpp the functions defined in the OpenCL library might not be loaded, since there is nothing before them asking for a look-up. However, if you use g++ opencl.cpp -I/usr/local/cuda/include -L/usr/lib/nvidia-current -lOpenCL then any … Read more

How to get a “random” number in OpenCL

I was solving this “no random” issue for the last few days and came up with three different approaches: Xorshift – I created a generator based on this one. All you have to do is provide one uint2 number (seed) for the whole kernel and every work item will compute its own random number // ‘randoms’ is … Read more
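
The xorshift idea can be sketched in plain C as follows. This is not the generator from the answer above, whose code is cut off in this excerpt. It is Marsaglia's standard 32-bit xorshift (shifts 13, 17, 5), and the helper `work_item_random`, its single 32-bit seed (instead of a uint2), and the way the work-item id is mixed into the seed are all assumptions made for illustration.

```c
#include <stdint.h>

/* One step of Marsaglia's 32-bit xorshift (shifts 13, 17, 5).
 * The state must be nonzero, otherwise the output stays zero. */
uint32_t xorshift32(uint32_t state)
{
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

/* Hypothetical helper: derives a per-work-item state from one seed shared
 * by the whole kernel plus the work-item's global id, then advances it once.
 * The multiplier is an assumption chosen only to spread neighbouring ids. */
uint32_t work_item_random(uint32_t seed, uint32_t global_id)
{
    uint32_t state = seed ^ (global_id * 2654435761u);
    if (state == 0) {
        state = 1; /* xorshift cannot leave the all-zero state */
    }
    return xorshift32(state);
}
```

In an actual kernel the same arithmetic would operate on `uint` values, with `get_global_id(0)` supplying the id. Each work-item gets the same value on every run for a given seed, so the host has to change the seed between launches if different numbers are wanted.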

Using Keras & Tensorflow with AMD GPU

I’m writing an OpenCL 1.2 backend for TensorFlow (https://github.com/hughperkins/tensorflow-cl). This fork of TensorFlow for OpenCL has the following characteristics: it targets any/all OpenCL 1.2 devices. It doesn’t need OpenCL 2.0, doesn’t need SPIR-V or SPIR, and doesn’t need Shared Virtual Memory. And so on … it’s based on an underlying library called ‘cuda-on-cl’ (https://github.com/hughperkins/cuda-on-cl). cuda-on-cl … Read more
