We present a way to improve the performance of the Vienna Ab initio Simulation Package (VASP), an electronic structure program. We show that high-performance computers equipped with graphics processing units (GPUs) as accelerators can drastically reduce the computation time when the most time-consuming sections of the code are offloaded to the graphics chips. The procedure consists of (i) profiling the code to isolate the time-consuming parts, (ii) rewriting these parts so that the algorithms become better suited to the chosen graphics accelerator, and (iii) optimizing memory traffic between the host computer and the GPU accelerator. We chose to accelerate VASP on NVIDIA GPUs using CUDA. We compare the GPU and original versions of VASP by evaluating the Davidson and RMM-DIIS algorithms on chemical systems of up to 1100 atoms. In these tests, the total time is reduced by a factor of 3 to 8 when running on n (CPU core + GPU) pairs compared with n CPU cores alone, without any loss of accuracy.
We have provided a hybrid, massively parallel ab initio molecular dynamics software for GPU clusters. To avoid continuously transferring data between CPUs and GPUs, we have ported some functions to CUDA and achieved a balanced combination of CUFFT, CUBLAS, and CUDA. We have established a multi-GPU platform to improve the overall performance of the software. On the B505 configuration, adding 16 NVIDIA GT200 GPUs to only 16 cores (out of 64) delivers the computational power of the full 64-core architecture, while leaving 48 cores available for other calculations. Moreover, adding a Tesla Fermi GPU to a traditional machine (using a Xeon Q9450) speeds up VASP by a factor of 3 to 8.
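To illustrate why profiling and limiting host-GPU transfers both matter, the following is a minimal Amdahl-style sketch, not the authors' performance model. The function name `estimated_speedup` and all of its parameters are hypothetical, and any numbers passed to it are illustrative rather than measurements from this work.

```python
def estimated_speedup(offload_fraction, kernel_speedup, transfer_fraction=0.0):
    """Amdahl-style estimate of whole-program speedup after GPU offloading.

    All arguments are hypothetical inputs, not values taken from the paper.

    offload_fraction  -- share of the original run time spent in the
                         sections moved to the GPU (between 0 and 1)
    kernel_speedup    -- how many times faster those sections run on the GPU
    transfer_fraction -- time spent copying data between host and GPU,
                         expressed as a share of the original run time
    """
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    if transfer_fraction < 0.0:
        raise ValueError("transfer_fraction must be non-negative")

    # Normalised new run time: the part left on the CPU, the accelerated
    # part, and the extra cost of moving data back and forth.
    new_time = (
        (1.0 - offload_fraction)
        + offload_fraction / kernel_speedup
        + transfer_fraction
    )
    return 1.0 / new_time


if __name__ == "__main__":
    # Illustrative inputs only: 90% of the time offloaded, 10x faster kernels.
    print(estimated_speedup(0.9, 10.0))
    # Same case, with transfers costing 10% of the original run time.
    print(estimated_speedup(0.9, 10.0, transfer_fraction=0.1))
```

With these illustrative inputs the estimate drops from about 5.3 to about 3.4 once transfers are included, which is the kind of loss that keeping data resident on the GPU is meant to avoid.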
Mohamed Hacene, Ani Anciaux-Sedrakian, Xavier Rozanska, Diego Klahr, Thomas Guignon, Paul Fleurat-Lessard. Accelerating VASP electronic structure calculations using graphic processing units. Journal of Computational Chemistry. Early View, 2012. [DOI: 10.1002/jcc.23096]