GPU Linear Algebra Libraries and GPGPU Programming for Accelerating MOPAC Semiempirical Quantum Chemistry Calculations
In this study, we present modifications to the semiempirical quantum chemistry code MOPAC2009 that accelerate single-point energy calculations (1SCF) of medium-size molecular systems (up to 2500 atoms) using GPU coprocessors and multithreaded shared-memory CPUs. The modifications combine highly optimized linear algebra libraries for both CPU (LAPACK and BLAS from Intel MKL) and GPU (MAGMA and CUBLAS) to speed up time-consuming parts of MOPAC such as the pseudodiagonalization, full diagonalization, and density matrix assembly. We show that large speedups can be obtained just by using serial CPU linear algebra libraries in the MOPAC code. As a special case, we show a speedup of up to 14 times for a methanol simulation box containing 2400 atoms and 4800 basis functions, with even greater gains in performance when using multithreaded CPUs (2.1 times in relation to the single-threaded CPU code using linear algebra libraries) and GPUs (3.8 times). This degree of acceleration opens new perspectives for modeling larger structures which appear in inorganic chemistry (such as zeolites and MOFs), biochemistry (such as polysaccharides, small proteins, and DNA fragments), and materials science (such as nanotubes and fullerenes). In addition, we believe that this parallel (CPU–GPU) MOPAC code will make it feasible to use semiempirical methods in lengthy molecular simulations using both hybrid QM/MM and QM/QM potentials.
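The speedup figures quoted above are stated against two different baselines, so a rough sketch of how they might combine can help. The short Python snippet below multiplies them out under two assumptions that the text does not confirm: that the 3.8 times GPU gain uses the same baseline as the 2.1 times multithreaded gain (the single-threaded code with linear algebra libraries), and that the factors compose multiplicatively. Since the 14 times figure is an "up to" value, the results are upper-bound estimates, not measured numbers.

```python
# Figures quoted in the abstract for the 2400-atom methanol box.
serial_lib_speedup = 14.0  # serial CPU libraries vs. original code ("up to")
multithread_gain = 2.1     # multithreaded CPU vs. single-threaded library code
gpu_gain = 3.8             # GPU vs. single-threaded library code (assumed baseline)


def combined(base: float, extra: float) -> float:
    """Compose two speedups measured against successive baselines."""
    return base * extra


# Estimated overall speedups relative to the original MOPAC2009 code.
cpu_total = combined(serial_lib_speedup, multithread_gain)
gpu_total = combined(serial_lib_speedup, gpu_gain)

print(f"multithreaded CPU vs. original: ~{cpu_total:.1f}x")
print(f"GPU vs. original: ~{gpu_total:.1f}x")
```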
We have developed and implemented new features in the MOPAC2009 program so that it runs faster on GPUs and shared-memory CPUs. We replaced time-consuming parts of the semiempirical single-point energy calculation with accelerated procedures that combine highly optimized linear algebra libraries for both CPU (LAPACK and BLAS from Intel MKL) and GPU (MAGMA and CUBLAS).
In short, we propose both hybrid CPU-GPU and multithreaded CPU accelerated versions of the pseudodiagonalization, full diagonalization, and density matrix assembly steps in the MOPAC2009 code. These time-consuming parts of the SCF calculation were identified by profiling a standard run of MOPAC2009.
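To illustrate why a step such as density matrix assembly maps well onto a linear algebra library, the sketch below builds a closed-shell density matrix in two ways: element by element with nested loops, and as a single matrix product P = 2 C_occ C_occ^T. It assumes real molecular orbital coefficients, doubly occupied orbitals, and an orthonormal basis; the function names are hypothetical, and the pure-Python `matmul` merely stands in for a BLAS level-3 routine such as DGEMM. It is not the MOPAC implementation, only an illustration of the restructuring involved.

```python
import math


def density_loops(c, n_occ):
    """Element-wise assembly: P[mu][nu] = 2 * sum_i c[mu][i] * c[nu][i]."""
    n = len(c)
    p = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            total = 0.0
            for i in range(n_occ):  # sum over occupied orbitals only
                total += c[mu][i] * c[nu][i]
            p[mu][nu] = 2.0 * total
    return p


def matmul(a, b):
    """Plain matrix product; a stand-in for a BLAS level-3 call."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def density_matmul(c, n_occ):
    """Same matrix written as one product: P = 2 * C_occ * C_occ^T."""
    c_occ = [row[:n_occ] for row in c]            # n x n_occ block
    c_occ_t = [list(col) for col in zip(*c_occ)]  # n_occ x n transpose
    return [[2.0 * v for v in row] for row in matmul(c_occ, c_occ_t)]


# Toy example: two basis functions, one doubly occupied orbital.
s = 1.0 / math.sqrt(2.0)
coeffs = [[s, s], [s, -s]]  # columns are molecular orbitals
p_loops = density_loops(coeffs, 1)
p_prod = density_matmul(coeffs, 1)
```

Both routines return the same matrix, and in this orthonormal toy case its trace equals the number of electrons. Once the computation is expressed as one dense product, the loop nest can be handed to an optimized CPU or GPU library without changing the result.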
We designed four different protocols to assess the real impact of our modifications to the MOPAC legacy code on semiempirical single-point energy calculations. To that end, we carried out calculations on different molecular systems comprising small proteins, various solvent boxes, a fullerene, and nanotubes.
As a general conclusion, nearly every modification improved MOPAC's performance, indicating that optimized linear algebra libraries (LAPACK and BLAS) provide a very powerful way to boost it, either on their own or coupled with multithreaded CPU environments (Intel MKL) or their GPU counterparts (CUBLAS and MAGMA).
A possible application was shown in the single-point calculation of protein decoy sets in terms of canonical molecular orbitals. Around 2000 calculations of that type were performed at a rate of around 30 s per structure. Furthermore, PM6-DH+ was shown to be capable of identifying the native structure of a protein from a large set of decoy conformations by using just the enthalpies of formation in water solution.
Speedups of that magnitude for quantum chemical calculations open new perspectives for studying larger structures which appear in inorganic chemistry (such as zeolites and MOFs), biochemistry (such as polysaccharides, small proteins, and DNA fragments), and materials science (such as nanotubes and fullerenes). In addition, we believe that this parallel (CPU–GPU) MOPAC code will make it possible to use semiempirical methods in long molecular simulations using both hybrid QM/MM and QM/QM potentials.
Julio Daniel Carvalho Maia, Gabriel Aires Urquiza Carvalho, Carlos Peixoto Mangueira, Jr., Sidney Ramos Santana, Lucidio Anjos Formiga Cabral, and Gerd B. Rocha. GPU Linear Algebra Libraries and GPGPU Programming for Accelerating MOPAC Semiempirical Quantum Chemistry Calculations. Journal of Chemical Theory and Computation, Article ASAP, 2012. [DOI: 10.1021/ct3004645]