General-purpose computing on graphics processing units (GPGPU) is dramatically changing the landscape of high performance computing in astronomy. In this paper, we identify and investigate several key decision areas, with the goal of simplifying the early adoption of GPGPU in astronomy. We consider the merits of OpenCL as an open standard for reducing the risks associated with coding in a native, vendor-specific programming environment, and present a GPU programming philosophy based on brute force solutions. We assert that effective use of new GPU-based supercomputing facilities will require a change in approach from astronomers. This will likely include improved programming training, an increased need for software development best practice through the use of profiling and related optimisation tools, and a greater reliance on third-party code libraries. As with any new technology, those willing to take the risks and invest the time and effort to become early adopters of GPGPU in astronomy stand to reap great benefits.
In this paper, we have highlighted some of the benefits and limitations of early adoption of GPGPU for astronomy. While there are risks, and significant effort may be required to prepare codes, in many cases the benefits will outweigh the limitations. The preferred outcome for astronomers is that the majority of time and effort is spent on scientific outcomes rather than on software development.
The promise of the OpenCL standard is to provide opportunities for hardware-agnostic coding. OpenCL appears to offer considerable implementation flexibility compared with a native API (such as CUDA for NVIDIA), without a significant decrease in processing speed. Furthermore, we suggest that for certain classes of scientific computation, a step backwards to consider simple, brute force solutions that were not feasible on a CPU may actually reduce software development times. The resulting codes may already be ‘no worse’ than the best single-core alternatives, and may even be more accurate, or overcome limitations of existing optimised approaches. Lessons learned in starting with brute force solutions can then help researchers determine whether a longer-term solution does indeed warrant the effort of implementing a more sophisticated alternative.
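To make the brute force philosophy concrete, consider direct O(N²) pairwise gravitational summation. The sketch below is ours, not from the original work; the function name and softening parameter are illustrative. The point is structural: each particle's acceleration is computed independently of every other particle's result, so on a GPU the outer loop maps to one thread per particle, whereas a CPU implementation would typically require a more sophisticated tree code to be competitive.

```python
import numpy as np

def direct_forces(pos, mass, eps=1e-3):
    """Brute-force O(N^2) gravitational accelerations (G = 1).

    Every particle's acceleration is an independent sum over all other
    particles, so the outer loop maps naturally onto one GPU thread
    per particle -- the embarrassingly parallel structure that makes
    brute force attractive on GPUs.
    """
    n = len(pos)
    acc = np.zeros_like(pos)
    for i in range(n):                           # one "GPU thread" per particle
        d = pos - pos[i]                         # vectors to all particles
        r2 = (d * d).sum(axis=1) + eps * eps     # softened squared distances
        r2[i] = np.inf                           # exclude self-interaction
        acc[i] = (mass[:, None] * d / r2[:, None] ** 1.5).sum(axis=0)
    return acc
```

The inner vectorised sum mirrors the per-thread work of a GPU kernel; no inter-thread communication is needed until (optionally) a later reduction, which is why such codes port to GPUs with little algorithmic effort.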
While running codes faster may be an end in itself, faster computation also means that there is more time to explore parameter space. This might include running models with different parameters, or repeating models with different random seeds in order to build up a more robust statistical sample. Additionally, GPUs provide opportunities to tackle, at greatly reduced cost, computational problems that remain infeasible on a single-core CPU or on traditional multi-core computing clusters.
Not all applications require GPUs, so some time and effort should be invested in understanding which types of problems will achieve the greatest benefit. For example, telescope control software does not parallelise well, if at all, whereas the highly parallel nature of the Fourier transform, used extensively in astronomy, makes it an ideal candidate for GPGPU. Indeed, there are problems, such as the conceptually simple process of generating a histogram from data values, that are easy to implement on a CPU but become unnecessarily complex when attempting a parallel solution. Computational tasks compatible with a stream processing paradigm, i.e. many individual data streams requiring identical computations, are strong candidates for moving from the CPU to the GPU. Fortunately, a high degree of data parallelism is present in many astronomy scenarios (e.g. the use of the CLEAN algorithm in radio astronomy, which takes advantage of data parallelism in the spectral domain).
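The histogram example can be sketched briefly (the code and names below are ours, for illustration only, and mimic the parallel structure serially). A naive parallel scatter, where all threads increment bins of a single shared histogram, races on the bin counters; the standard workaround gives each thread a private sub-histogram and merges them in a final reduction step, which is exactly the added complexity a serial CPU version never needs.

```python
import numpy as np

def parallel_histogram(data, nbins, nthreads=4):
    """Histogram via per-thread sub-histograms, mimicking the GPU strategy.

    Incrementing a single shared histogram from many threads races on
    the bin counters. Instead, each "thread" fills a private histogram
    over its own chunk of the data (race-free), and the partial results
    are merged in a final reduction.
    """
    edges = np.linspace(data.min(), data.max(), nbins + 1)  # shared bin edges
    chunks = np.array_split(data, nthreads)                 # one chunk per thread
    partial = np.zeros((nthreads, nbins), dtype=np.int64)
    for t, chunk in enumerate(chunks):                      # independent work
        partial[t], _ = np.histogram(chunk, bins=edges)
    return partial.sum(axis=0), edges                       # merge (reduction)
```

Because histogramming with fixed bin edges is additive over any partition of the data, the merged result is identical to a single serial pass; the cost of parallelism here is purely the extra sub-histogram storage and the reduction step.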
The long-term role of the GPU is still unknown: whether GPUs will remain computational coprocessors, or whether multi-core CPUs will grow to become more GPU-like. There may be other radical changes in hardware in the years ahead, such as the experimental 48-core Intel single-chip cloud computer announced in 2009. The recent demise of Intel’s Larrabee consumer GPU chip, a hybrid CPU and GPU with features such as cache coherency across all cores and greater flexibility in computation, may have delayed resolution of this issue for at least a few more years. While these short-term changes may lead to some redundancy in code development effort, awareness of the fundamental differences between CPU and GPU programming and execution should provide insight into problem solving for future highly parallel architectures. Moreover, we anticipate that the move to astronomical GPGPU may not be limited to HPC facilities, but will ultimately encompass desktop and notebook supercomputing. GPGPU represents a natural new direction for astrophysical HPC. Adoption of a radical new processing architecture, and the corresponding change in approach to software development, is worthwhile if our understanding of the universe advances at an accelerated rate. We remain enthusiastic about the prospects for GPGPU in astronomy.