This study presents an application of the two-dimensional direct simulation Monte Carlo (DSMC) method on clusters of graphics processing units (GPUs) using an MPI-CUDA parallelization paradigm. An all-device (i.e. GPU) computational approach is adopted in which every stage of the computation, including particle moving, indexing, particle collisions and state sampling, is performed on the GPU, leaving the CPU idle. Communication between the GPU and the host is performed only to enable multiple-GPU computation. Results show that, compared with a single core of an Intel Xeon X5670 CPU, the computational expense is reduced by a factor of 15 when using a single GPU and by a factor of 185 when using 16 GPUs. The demonstrated parallel efficiency is 75% when using 16 GPUs, relative to a single GPU, for simulations using 30 million simulated particles. Finally, several very large-scale simulations in the near-continuum regime are used to demonstrate the excellent capability of the current parallel DSMC method.
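To make the relation between the quoted speedups and parallel efficiency concrete, the sketch below uses the conventional definition: efficiency is the multi-GPU speedup divided by the product of the single-GPU speedup and the number of GPUs. The function name `parallel_efficiency` is hypothetical and is not taken from the paper. The abstract does not say whether the headline speedups (15 and 185) and the 75% figure come from the same test case, so the value computed here is only illustrative and should not be read as a correction.

```python
def parallel_efficiency(speedup_multi: float, speedup_single: float, n_gpus: int) -> float:
    """Parallel efficiency of n_gpus GPUs relative to one GPU.

    Both speedups are measured against the same baseline (here, one CPU core),
    so their ratio is the speedup of the multi-GPU run over the single-GPU run.
    """
    if n_gpus <= 0 or speedup_single <= 0:
        raise ValueError("n_gpus and speedup_single must be positive")
    return speedup_multi / (speedup_single * n_gpus)


if __name__ == "__main__":
    # Headline figures from the abstract; assumed here to describe one case.
    eff = parallel_efficiency(speedup_multi=185.0, speedup_single=15.0, n_gpus=16)
    print(f"Parallel efficiency on 16 GPUs: {eff:.1%}")
```

Under that assumption the headline speedups give roughly 77%, close to but not identical to the 75% reported for the 30-million-particle simulations.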
C.-C. Su, M.R. Smith, F.-A. Kuo, J.-S. Wu, C.-W. Hsieh, K.-C. Tseng. Large-scale simulations on multiple Graphics Processing Units (GPUs) for the direct simulation Monte Carlo method. Journal of Computational Physics. Volume 231, Issue 23, Pages 7932–7958, 2012. [doi: 10.1016/j.jcp.2012.07.038]