OpenCL is usually perceived as a C dialect for GPGPU programming – doing “general-purpose” computations (not necessarily graphics) on GPU hardware. “It’s like Nvidia’s CUDA C, but portable”. However, OpenCL the language is not really tied to the GPU architecture. That is, hardware could run OpenCL programs and have an architecture very different from a GPU, resulting in a very different performance profile.
OpenCL is possibly the first programming language promising portability across accelerators: “OpenCL is for accelerators what C is for CPUs”. Portability is disruptive. When hardware vendor A displaces vendor B, portable software usually helps a great deal. Will OpenCL – “the GPGPU language” – eventually help displace GPGPU, by facilitating “GP-something-else” – “general-purpose” accelerators which aren’t like GPUs? Yossi discussed this question on general grounds, and considered two specific examples of recent OpenCL accelerators: Adapteva’s Parallella and ST’s P2012.
Why displace GPGPU?
First of all, whether GPGPU is likely to be displaced or not – what could “GP-something-else” possibly give us that GPGPU doesn’t?
There are two directions from which benefits could come – you could call them two opposite directions:
1. Let software (ab)use more types of special-purpose accelerators. GPGPU lets you utilize (abuse?) your GPU for general-purpose stuff. It could be nice to have “GPDSP” to utilize the DSPs in your phone, “GPISP” to utilize the ISP, “GPCVP” to utilize computer vision accelerators likely to emerge in the future, etc. From GPGPU to GP-everything.
2. Give software accelerators which are more general-purpose to begin with. GPGPU means doing your general-purpose stuff under the constraints imposed by the GPU architecture. An OpenCL accelerator lifting some of these constraints could be very welcome.
Could OpenCL help us get benefits from either of the directions (1) and (2)?
(1) is about making use of anal-retentive, efficiency-obsessed, weird, incompatible hardware. It’s rather hard, for OpenCL or for any other portable, reasonably “pretty” language.
OpenCL does provide constructs more or less directly mapping to some of the “ugly” features common to many accelerators, for example:
- Explicitly addressed local memory (as opposed to cache)
- DMA (bulk memory transfers)
- Short vector data types to make use of SIMD opcodes
- Light-weight threads and barriers
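To make the first two items concrete, here is a minimal sketch in plain C (not OpenCL C, and not taken from any particular accelerator) of the staging pattern they imply: data is moved in bulk from large "global" memory into a small, explicitly addressed buffer, processed there, and moved back. The names `scale_tiled` and `TILE` are hypothetical, and `memcpy` merely stands in for a DMA transfer; in an actual OpenCL kernel the buffer would be declared `__local` and the copies would typically be done with `async_work_group_copy`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 64  /* hypothetical local-memory capacity, in floats */

/* Scale n floats by k, one tile at a time. `in` and `out` play the role of
   global (off-chip) memory; `local_buf` plays the role of local memory. */
void scale_tiled(const float *in, float *out, size_t n, float k)
{
    float local_buf[TILE];  /* explicitly addressed scratchpad, not a cache */

    assert(in != NULL && out != NULL);

    for (size_t base = 0; base < n; base += TILE) {
        size_t len = (n - base < TILE) ? (n - base) : TILE;

        /* "DMA in": one bulk transfer from global to local memory. */
        memcpy(local_buf, in + base, len * sizeof(float));

        /* The compute step touches local memory only. */
        for (size_t i = 0; i < len; ++i)
            local_buf[i] *= k;

        /* "DMA out": one bulk transfer back to global memory. */
        memcpy(out + base, local_buf, len * sizeof(float));
    }
}
```

The point of the sketch is that the programmer, not the hardware, decides what lives in fast memory and when it gets there. With a cache, the same loop would simply read `in[i]` and leave placement to the hardware.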
But even with GPUs, OpenCL can’t target all of the GPU’s resources. There’s the subset of the GPU accessible to GPGPU programs – and then there are the more idiosyncratic and less flexible parts used for actual graphics processing.