A novel and scalable Multigrid algorithm for many-core architectures
Multigrid algorithms are among the fastest iterative methods known today for solving large linear and some non-linear systems of equations. Although highly optimized for serial operation, they retain considerable potential for parallelism that has not been fully realized. In this work, we present a novel multigrid algorithm designed to work entirely inside many-core architectures such as graphics processing units (GPUs), without memory transfers between the GPU and the central processing unit (CPU), thereby avoiding low-bandwidth communications. The algorithm is denoted the high occupancy multigrid (HOMG) because it uses entire-grid operations, with interpolations and relaxations fused into one task, providing useful work for every thread in the grid. For a given accuracy, its number of operations scales linearly with the total number of nodes in the grid. Perfect scalability is observed for a large number of processors.
Conclusion
In this article, we present a novel multigrid algorithm specially designed to work entirely inside many-core architectures such as GPUs, without memory transfers between the GPU and the CPU. The algorithm uses entire-grid operations even for coarse grid corrections. Interpolations and relaxations are fused into one task, giving useful work to every thread in the grid. In this way the algorithm has full multithreading and fixed memory access patterns, allowing full exploitation of the fast memory models of the GPU and efficiently hiding cache misses and memory latencies.
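For readers unfamiliar with the building blocks named above (relaxation, coarse grid correction, interpolation), the following is a minimal sketch of a textbook multigrid V-cycle for the 1D Poisson problem -u'' = f with zero boundary values. It is not the HOMG algorithm: here each step runs as a separate serial loop on a progressively smaller grid, whereas the paragraph above describes fusing interpolation and relaxation into a single task over the entire grid. All function names, the weighted Jacobi smoother, the test problem, and the parameter values are assumptions chosen for illustration.

```python
import math

def relax(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps for -u'' = f (end points are fixed boundary values)."""
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, len(u) - 1):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u

def residual(u, f, h):
    """r = f - A u for the standard 3-point Laplacian; zero at the boundaries."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r

def restrict(r):
    """Full-weighting restriction to the grid with half as many intervals."""
    nc = (len(r) - 1) // 2 + 1
    rc = [0.0] * nc
    for j in range(1, nc - 1):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    return rc

def prolong(ec):
    """Linear interpolation from the coarse grid to the next finer grid."""
    e = [0.0] * (2 * (len(ec) - 1) + 1)
    for j in range(len(ec)):
        e[2 * j] = ec[j]
    for j in range(len(ec) - 1):
        e[2 * j + 1] = 0.5 * (ec[j] + ec[j + 1])
    return e

def v_cycle(u, f, h):
    """One V-cycle: pre-smooth, coarse grid correction, post-smooth."""
    if len(u) <= 3:
        # Coarsest grid has a single interior point: solve it exactly.
        u[1] = 0.5 * (h * h * f[1] + u[0] + u[2])
        return u
    relax(u, f, h)
    rc = restrict(residual(u, f, h))
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)
    e = prolong(ec)
    for i in range(1, len(u) - 1):
        u[i] += e[i]
    relax(u, f, h)
    return u

def solve_poisson(levels=6, cycles=10):
    """Solve -u'' = pi^2 sin(pi x) on [0, 1]; the exact solution is sin(pi x)."""
    n = 2 ** levels                      # number of intervals on the finest grid
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * (n + 1)
    for _ in range(cycles):
        v_cycle(u, f, h)
    err = max(abs(u[i] - math.sin(math.pi * x[i])) for i in range(n + 1))
    return u, err

if __name__ == "__main__":
    _, err = solve_poisson()
    print("max error against sin(pi x):", err)
```

In this conventional form the coarse levels touch only a fraction of the nodes, which on a GPU would leave most threads idle; the entire-grid operations described above are the authors' answer to that loss of occupancy.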
The algorithm is denoted the high occupancy multigrid (HOMG) algorithm because multithreading and useful work per thread are kept constantly high. It is combined with a modified full multigrid (MFMG) cycle to reach high efficiency. For a given accuracy, the number of operations of the HOMG scales linearly with the total number of nodes. Perfect scalability is observed for a large number of processors and large grids.
Julian Becerra-Sagredo, Carlos Malaga, Francisco Mandujano. Reprint [arXiv:1108.2045v1] [pdf]






