One of the most annoying things about OpenCL was the automatic CPU fallback. Trying to figure out if I had it actually _working_ or not was a terrible first impression. It gave an output, sure, but was it slow because it was still hitting the CPU fallback? Had I failed to install the GPU driver I had attempted to install? Or was it slow because my kernel wasn't GPU-friendly? Or hit some slow path in OpenCL?
Meanwhile CUDA, being entirely GPU only, was a breeze. Did I get an output? OK then yes I got the basics working. Now on to making it sing.
I really don't understand this - it's very easy to set which device (CPU / GPU) your OpenCL code is running on and to avoid running on the CPU if you don't want that.
On the other hand the ability to test on GPU-less systems is very useful indeed.
OpenCL has lots of faults but the ability to use a CPU with the appropriate drivers when a GPU isn't available surely isn't one of them.
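For what it's worth, here is a minimal sketch of pinning the device type so that a missing GPU driver shows up as an error rather than a silent CPU run. It assumes the third-party `pyopencl` bindings; `pick_device` and `make_gpu_context` are hypothetical helper names of my own, not part of any library. The two constants are the bitfield values from the OpenCL headers.

```python
# Device-type bitfield values as defined in the OpenCL headers (cl.h).
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2


def pick_device(platforms, wanted_type=CL_DEVICE_TYPE_GPU):
    """Return the first device whose type bitfield includes wanted_type.

    Raises RuntimeError instead of quietly falling back to another device type.
    `platforms` is any iterable of objects with a get_devices() method, such as
    the list returned by pyopencl.get_platforms().
    """
    for platform in platforms:
        for device in platform.get_devices():
            if device.type & wanted_type:
                return device
    raise RuntimeError("no OpenCL device of the requested type found")


def make_gpu_context():
    """Hypothetical helper: build a context on a GPU only (needs pyopencl)."""
    import pyopencl as cl  # third-party; imported lazily so the rest runs without it

    device = pick_device(cl.get_platforms(), CL_DEVICE_TYPE_GPU)
    print("running on:", device.name)
    return cl.Context(devices=[device])
```

Calling `make_gpu_context()` on a machine where the GPU driver didn't install should raise straight away, which answers the "is it actually on the GPU?" question. Passing `CL_DEVICE_TYPE_CPU` to `pick_device` gives the deliberate CPU path for testing on GPU-less systems.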
> On the other hand the ability to test on GPU-less systems is very useful indeed.
Ideally OpenCL's support for CPUs shouldn't just be for testing - it should be able to make efficient use of the CPU's SIMD.
Intel still support OpenCL for CPU. [0] AMD used to, but they dropped support for CPU devices, at least on Windows. Perhaps they still support them on Linux. [1][2]
I'm actually using AMD CPU drivers (OpenCL2 on Ubuntu 20.04) in a project that I hope will be in production over the next month or so. Speed is more than acceptable for my purposes.
Tried to use Intel's but didn't manage to get them working. AMD's were a breeze by comparison. I'd be very interested in hearing about others' experiences and any AMD vs Intel benchmark comparisons.
Good to hear AMD haven't completely done away with their OpenCL-on-CPU.
Have you compared the performance against alternative technologies like OpenMP?
Years ago I had the Intel OpenCL running on an Intel CPU (the machine ran Windows) and AMD OpenCL running on an AMD CPU (that machine also ran Windows). AMD's Visual Studio plugin for OpenCL was really pretty good. Haven't used OpenCL since though so I can't really comment on the current state of things.
I'm using some OpenMP, but the code is exceptionally well suited to the OpenCL kernel approach, and the ability to explore using GPUs at some point in the future is helpful.
Thanks for the links - I'll have a look at the Intel drivers again and see if I have any more success!
Historically there was an emulator mode, where you could build your CUDA code to run on CPU. However, it was removed a few years ago. I never used it, so can't really give you more details beyond that.