Jill Reese and Sarah Zaranek of MathWorks just published an introductory article on GPU Programming in MATLAB. The article demonstrates features in Parallel Computing Toolbox that let you run your MATLAB code on a GPU by making a few simple changes to your code. The authors illustrate this approach by solving a second-order wave equation using spectral methods. Grab a PDF of the full article to read. Below we highlight several interesting excerpts from the article:
Comparing CPU and GPU Execution Speeds
To evaluate the benefits of using the GPU to solve second-order wave equations, we ran a benchmark study in which we measured the time the algorithm took to execute 50 time steps for grid sizes of 64, 128, 512, 1024, and 2048, first on an Intel Xeon Processor X5650 and then on an NVIDIA Tesla C2050 GPU.
For a grid size of 2048, the algorithm shows a 7.5x decrease in compute time from more than a minute on the CPU to less than 10 seconds on the GPU (Figure 1). The log scale plot shows that the CPU is actually faster for small grid sizes. As the technology evolves and matures, however, GPU solutions are increasingly able to handle smaller problems, a trend that we expect to continue.
Figure 1. Plot of benchmark results showing the time required to complete 50 time steps for different grid sizes, using either a linear scale (left) or a log scale (right).
Advanced GPU Programming with MATLAB
Parallel Computing Toolbox provides a straightforward way to speed up MATLAB code by executing it on a GPU. You simply change the data type of a function’s input to take advantage of the many MATLAB commands that have been overloaded for GPUArrays. (A complete list of built-in MATLAB functions that support GPUArray is available in the Parallel Computing Toolbox documentation.)
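As a minimal sketch of this workflow (not taken from the article), the only change needed is to wrap the input data in a `gpuArray`; overloaded built-ins such as `fft` then execute on the GPU, and `gather` brings the result back to CPU memory:

```matlab
% Requires Parallel Computing Toolbox and a supported GPU.
A = rand(1024);        % ordinary array in CPU memory
G = gpuArray(A);       % transfer the data to GPU memory
F = fft(G);            % fft is overloaded for GPUArray, so this runs on the GPU
result = gather(F);    % copy the result back to CPU memory
```

The computation itself is unchanged; only the data type of the input differs.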
To accelerate an algorithm with multiple simple operations on a GPU, you can use arrayfun, which applies a function to each element of an array. Because arrayfun is a GPU-enabled function, you incur the memory transfer overhead only on the single call to arrayfun, not on each individual operation.
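A hedged sketch of this pattern (the function `f` here is a made-up example, not from the article): several elementwise operations are fused into a single `arrayfun` call, so the GPU is invoked once rather than once per operation.

```matlab
% Hypothetical elementwise function combining several simple operations.
f = @(a, b) a.*b + sin(a) - sqrt(abs(b));

ga = gpuArray(rand(1e6, 1));
gb = gpuArray(rand(1e6, 1));

% One GPU-enabled arrayfun call applies f to every element pair,
% incurring the launch and transfer overhead only once.
gy = arrayfun(f, ga, gb);
y  = gather(gy);   % bring the result back to CPU memory
```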
Finally, experienced programmers who write their own CUDA code can use the CUDAKernel interface in Parallel Computing Toolbox to integrate this code with MATLAB. The CUDAKernel interface enables even more fine-grained control to speed up portions of code that were performance bottlenecks. It creates a MATLAB object that provides access to your existing kernel compiled into PTX code (PTX is a low-level parallel thread execution instruction set). You then invoke the feval command to evaluate the kernel on the GPU, using MATLAB arrays as input and output.
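A sketch of the CUDAKernel workflow, under the assumption that you have a kernel file `addVectors.cu` (a hypothetical name) compiled to PTX with `nvcc -ptx addVectors.cu`, defining `__global__ void addVectors(double *c, const double *a, const double *b)`:

```matlab
N = 1e6;

% Create a MATLAB object wrapping the compiled PTX kernel.
k = parallel.gpu.CUDAKernel('addVectors.ptx', 'addVectors.cu');
k.ThreadBlockSize = 256;
k.GridSize = ceil(N / 256);

a = gpuArray(rand(N, 1));
b = gpuArray(rand(N, 1));

% feval evaluates the kernel on the GPU; the first argument initializes
% the output corresponding to the kernel's non-const pointer parameter.
c = feval(k, zeros(N, 1), a, b);
result = gather(c);
```

Outputs of `feval` correspond to the kernel's writable pointer arguments, so no explicit memory management is needed on the MATLAB side.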
Engineers and scientists are successfully employing GPU technology, originally intended for accelerating graphics rendering, to accelerate their discipline-specific calculations. With minimal effort and without extensive knowledge of GPUs, you can now use the promising power of GPUs with MATLAB. GPUArrays and GPU-enabled MATLAB functions help you speed up MATLAB operations without low-level CUDA programming. If you are already familiar with programming for GPUs, MATLAB also lets you integrate your existing CUDA kernels into MATLAB applications without requiring any additional C programming.
To achieve speedups with GPUs, your application must satisfy some criteria, among them that the time spent transferring data between the CPU and the GPU must be less than the time saved by running the computation on the GPU. If your application satisfies these criteria, it is a good candidate for the range of GPU functionality available with MATLAB.
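One rough way to check this criterion is to time the transfer and the computation separately, as in this sketch (`wait(gpuDevice)` forces the asynchronous GPU work to finish before `toc` reads the clock):

```matlab
A = rand(2048);

tic;
G = gpuArray(A);         % CPU-to-GPU transfer
wait(gpuDevice);
tTransfer = toc;

tic;
F = fft2(G);             % the actual computation on the GPU
wait(gpuDevice);
tCompute = toc;

% If tTransfer dominates tCompute, the problem is likely too small
% (or too transfer-heavy) to benefit from the GPU.
fprintf('transfer: %.4f s, compute: %.4f s\n', tTransfer, tCompute);
```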