We solve the ghost-gluon system of Yang-Mills theory using Graphics Processing Units (GPUs). Working in Landau gauge, we use the Dyson-Schwinger formalism for the mathematical description, as this approach is well suited to benefit directly from the computing power of GPUs. The non-linear system of coupled integral equations is linearized by expanding the dressing functions in Chebyshev polynomials and subsequently applying a Newton-Raphson method. The resulting Newton matrix is generated in parallel using OpenMPI and CUDA(TM). Our results show that the run time can be cut by two orders of magnitude compared to a sequential version of the code. This makes the proposed techniques well suited for Dyson-Schwinger calculations on more complicated systems in which the Yang-Mills sector of QCD serves as a starting point. In addition, the computation of Schwinger functions using GPU devices is studied.
We performed a numerical analysis of the ghost-gluon Dyson-Schwinger equations (DSEs) of Yang-Mills theory. The truncated system of non-linear integral equations was solved by expanding the dressing functions in Chebyshev polynomials and subsequently applying a Newton-Raphson method, which reduces each iteration step to a linear system. These methods are ideally suited for an SIMD architecture, as the problem decomposes into independent parts. The system was parallelized using OpenMPI and CUDA(TM). By comparing the two parallelization strategies, we demonstrated the computational advantage of GPUs for this problem. Compared to a sequential version, we obtained speed-ups of approximately two orders of magnitude with just a single consumer GPU. The presented results demonstrate convincingly the benefits of modern GPU devices in DSE calculations, and the proposed solution strategy offers a helpful toolbox. Finally, the generalization to larger systems is straightforward, since additional DSEs can be incorporated by extending the Newton matrix with the corresponding derivatives. In this respect, we provided a basis for ongoing and future computations that use the Yang-Mills DSE system as input. Here, the GPU version performs progressively better with increasing workload, and this trend is expected to continue.
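To illustrate the solution strategy summarized above, the following minimal Python sketch applies a Chebyshev expansion and a Newton-Raphson iteration to a toy non-linear integral equation, f(x) = 1 + lam * ∫ K(x, y) f(y)^2 dy on [-1, 1]. This is not the ghost-gluon system itself: the toy equation, the kernel, the coupling `lam`, the function name `solve_toy_equation`, and the Gauss-Legendre quadrature are all assumptions made purely for illustration. The sketch is meant to show that each row of the Newton matrix depends only on its own collocation point, which is the kind of independence that a parallel implementation can exploit.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb
from numpy.polynomial.legendre import leggauss


def solve_toy_equation(kernel, lam=0.1, n_coeff=12, n_quad=32,
                       tol=1e-12, max_iter=50):
    """Solve f(x) = 1 + lam * int_{-1}^{1} K(x, y) f(y)^2 dy (toy model).

    f is expanded as sum_j c_j T_j(x); the coefficients c_j are the unknowns.
    """
    # Collocation points: Chebyshev nodes, one equation per node.
    k = np.arange(n_coeff)
    x = np.cos(np.pi * (k + 0.5) / n_coeff)
    # Gauss-Legendre nodes and weights for the integral over y.
    y, w = leggauss(n_quad)

    t_x = cheb.chebvander(x, n_coeff - 1)  # T_j(x_i), shape (n_coeff, n_coeff)
    t_y = cheb.chebvander(y, n_coeff - 1)  # T_j(y_q), shape (n_quad, n_coeff)
    k_w = kernel(x[:, None], y[None, :]) * w[None, :]  # K(x_i, y_q) * w_q

    def residual(c):
        f_y = t_y @ c
        return t_x @ c - 1.0 - lam * (k_w @ f_y**2), f_y

    c = np.zeros(n_coeff)
    c[0] = 1.0  # initial guess: f(x) = 1
    for _ in range(max_iter):
        r, f_y = residual(c)
        if np.linalg.norm(r) < tol:
            break
        # Newton matrix J_ij = dR_i/dc_j. Row i uses only x_i, so all rows
        # (and all entries) can be computed independently of each other.
        jac = t_x - 2.0 * lam * (k_w * f_y[None, :]) @ t_y
        c = c - np.linalg.solve(jac, r)

    r, _ = residual(c)
    return c, np.linalg.norm(r)


if __name__ == "__main__":
    gaussian = lambda x, y: np.exp(-(x - y) ** 2)  # hypothetical kernel
    coeffs, res = solve_toy_equation(gaussian)
    print("residual norm:", res)
    print("f(0) ~", cheb.chebval(0.0, coeffs))
```

In this sketch the Newton matrix is assembled with dense NumPy operations on a single device. Adding a further coupled equation would amount to enlarging the coefficient vector and appending the corresponding derivative blocks to `jac`, in the spirit of the extension of the Newton matrix described above.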