OpenCL 1.2 and Static C++ kernel language now available with AMD APP SDK
Since the release of the AMD OpenCL™ APP SDK 2.6 in December 2011, AMD has been making available preview versions of both OpenCL™ 1.2 support and improved C++ support for host-side and kernel-side coding. With the recent release of the AMD OpenCL™ APP SDK 2.7, these capabilities are now fully supported in the SDK and fully integrated into the run-time support delivered via the AMD Catalyst™ software drivers. AMD also continues to demonstrate leadership in OpenCL™ by being first to submit for ratification what we believe is a fully conformant OpenCL 1.2 solution for both CPU and GPU. I am also excited that AMD now supports both the C++ wrapper API and the AMD extension for the C++ kernel language. Together they enable complete application development in C++, removing the need for many of the OpenCL™ API boilerplate function calls in the host code while also improving type checking of kernel parameters.
Download AMD OpenCL™ APP SDK 2.7 now from http://developer.amd.com/appsdk.
In addition to the above, we have updated gDEBugger, APP Profiler, Kernel Analyzer and APP ML, and there are numerous new and improved samples. We are continuing to work on our samples, and new samples will be posted on http://developer.amd.com/sdks/AMDAPPSDK/samples/Pages/default.aspx as they become available over the next few months.
OpenCL™ 1.2 adds the following key capabilities:
- Host access flags for memory objects enable more efficient buffer handling and provide added protection. For example, a buffer that is created as “write only” cannot be read from the host.
- Pattern-based GPU buffer and image initialization can help eliminate the need for certain buffer/image transfers
- Memory object migration supports transferring buffers to a device before they are needed
- New generalized image creation API
- Enhanced image/buffer map operations
- OpenCL 1.2 CPU device partition including partition of a CPU after addition to a context
- Generalized 1D and 2D images, image arrays, and image <-> buffer interop
- Library support, including the separation of compile and link phases and the ability to compile
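To make the pattern-based initialization item more concrete: in OpenCL 1.2 the fill is issued with `clEnqueueFillBuffer` and runs on the device, so the host does not have to build an initialized array and transfer it. The sketch below is only a host-side model of those semantics in standard C++, not OpenCL code and not part of the SDK. The function name `fill_pattern` is hypothetical, and the rule that the offset and size must be multiples of the pattern size reflects my reading of the specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Host-side model only (hypothetical helper, not an OpenCL call):
// repeats `pattern` across the bytes [offset, offset + size) of `buffer`.
// Returns false when the arguments are rejected.
bool fill_pattern(std::vector<unsigned char>& buffer,
                  const void* pattern, std::size_t pattern_size,
                  std::size_t offset, std::size_t size) {
    if (pattern == 0 || pattern_size == 0) return false;
    // Assumed rule: offset and size are whole multiples of the pattern size.
    if (offset % pattern_size != 0 || size % pattern_size != 0) return false;
    // The filled region must lie entirely inside the buffer.
    if (offset > buffer.size() || size > buffer.size() - offset) return false;
    for (std::size_t pos = offset; pos < offset + size; pos += pattern_size) {
        std::memcpy(&buffer[pos], pattern, pattern_size);
    }
    return true;
}
```

For example, filling bytes 2 through 5 of an eight-byte buffer with a two-byte pattern writes the pattern twice and leaves the first and last two bytes untouched.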
The C++ Wrapper API provides the following new capabilities:
- Defaults for platform, queue, device, … helping to significantly reduce the amount of boilerplate code required.
- Improved, simplified constructors for cl::Buffer and the addition of cl::copy functions
- Additional support for events to functors
Notable C++ features that are supported by the OpenCL™ Static C++ Kernel language:
- Kernel and function overloading
- Inheritance:
  - Strict inheritance
  - Friend classes
  - Multiple inheritance
- Templates:
  - Kernel templates
  - Member templates
  - Template default argument
  - Limited class templates (the “virtual” keyword is not exposed)
  - Partial template specialization
- Namespaces
- References
- ‘this’ operator
- with external symbols
- Kernel reflection, the ability to query a kernel’s arguments
- Support for printf as a built in function
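Several of the language features listed above can be illustrated with a short sketch. What follows is ordinary host-side C++, not OpenCL kernel code and not taken from the SDK: in the AMD kernel language the entry points would be kernel functions compiled for the device, whereas here they are plain functions so the example builds with any C++ compiler. The names `demo`, `twice`, `scale`, `Accumulator` and `IsPointer` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical namespace, illustrating "Namespaces".
namespace demo {

// Function overloading: one name, two parameter types.
inline int   twice(int x)   { return 2 * x; }
inline float twice(float x) { return 2.0f * x; }

// Function template that writes its result through a reference.
template <typename T>
void scale(const std::vector<T>& in, T factor, std::vector<T>& out) {
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * factor;
    }
}

// Class template with a default template argument and a member template.
template <typename T = float>
class Accumulator {
public:
    Accumulator() : total_(T()) {}

    // Member template: accepts any value type convertible to T.
    template <typename U>
    Accumulator& add(U value) {
        this->total_ += static_cast<T>(value);  // explicit use of 'this'
        return *this;
    }

    T total() const { return total_; }

private:
    T total_;
};

// Partial template specialization: every pointer type matches T*.
template <typename T>
struct IsPointer { static const bool value = false; };

template <typename T>
struct IsPointer<T*> { static const bool value = true; };

}  // namespace demo
```

Calling `demo::twice(3)` selects the `int` overload and `demo::twice(1.5f)` the `float` one, while `demo::Accumulator<>` falls back to the default `float` element type.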
Additional features supported in SDK 2.7 and the Catalyst 12.4 drivers include:
- Support for Asynchronous PCI transfers
- Video encode using VCE Encode (Win7)
- Open Encode update (12.4)
- cl_khr_fp64 is now supported on AMD Radeon™ HD 7900 series devices (“Cayman”)
- Added OpenGL™ interoperability under Linux for AMD Radeon™ HD 7000 series devices
- Stability Improvements
- Performance improvements
- Support for AMD Radeon™ HD 7000 series devices (“Southern Islands”) NPI
- Support for AMD’s Second Generation APUs (“Trinity”)
- Kernel Analyzer v1.12
- APP Profiler v2.5
gDEBugger version 6.2, available for download for use with this SDK from http://developer.amd.com/gDEBugger
- Introducing Linux® Support
- New standalone user interface for both Linux® and Windows®, with enhancements for better navigation and ease of use
- Supports OpenCL™ kernel and API level debugging on AMD Radeon™ HD 7000 series graphics cards
- Supports OpenCL™ 1.2 beta drivers
- Automatic updater to notify and download new product updates
- Feature enhancements, including support for static arrays and union variables, and a Find feature
- Stability improvements
APP KernelAnalyzer v1.12
- Support for Catalyst revisions 12.1 through 12.4
APP Profiler v2.5 adds several key new features, including:
- Stability improvements
APP ML 1.8
- Support for real to complex FFT
New and updated samples
- Nbody: optimized for improved performance
- DeviceFission: a new version of this sample using OpenCL 1.2 Device Fission capabilities. The old version is still included but renamed as DeviceFission11Ext
- ImageOverlap and GaussianNoiseGL are two new OpenCL™ 1.2 samples
- DwtHaar1DCPPKernel: an additional version of DwtHaar1D but modified to use the C++ kernel language
- MatrixMultiplicationCPPKernel: an additional version of MatrixMultiplication but modified to use the C++ kernel language. This sample supports multiplication of both int and float matrices through use of a template.
- TransferOverlapCPP: an additional version of TransferOverlap but modified to use the C++ wrapper API
- The URNGNoiseGL and HistogramAtomics samples have been modified to use the C++ wrapper API
- The FFT, MersenneTwister, and EigenValue samples have been modified to use C++ kernel language
- There have been incremental improvements to a number of additional samples
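The idea behind MatrixMultiplicationCPPKernel, one templated routine serving both int and float matrices, can be sketched in standard C++. This is a host-side analog written for illustration only; it is not the sample's source, and the function name `matmul` and its row-major layout are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side analog of a templated matrix-multiply kernel.
// All matrices are row-major: a is rows x inner, b is inner x cols,
// and the returned matrix is rows x cols.
template <typename T>
std::vector<T> matmul(const std::vector<T>& a, const std::vector<T>& b,
                      std::size_t rows, std::size_t inner, std::size_t cols) {
    assert(a.size() == rows * inner);
    assert(b.size() == inner * cols);
    std::vector<T> c(rows * cols, T());
    // On a device, each (row, col) pair would typically map to one work-item.
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t col = 0; col < cols; ++col) {
            T sum = T();
            for (std::size_t k = 0; k < inner; ++k) {
                sum += a[row * inner + k] * b[k * cols + col];
            }
            c[row * cols + col] = sum;
        }
    }
    return c;
}
```

The element type is deduced from the arguments, so the same call form, `matmul(a, b, rows, inner, cols)`, works for `std::vector<int>` and `std::vector<float>` inputs without duplicating the loop.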
[via AMD blog]