ABSTRACT
We present a multi-GPU/CPU implementation of the Deflated Preconditioned Conjugate Gradient (DPCG) method for solving a highly ill-conditioned linear system arising from the discretized pressure-correction equation. We discuss the challenges and choices in such an implementation with respect to communication and data layout. We present results for systems with up to 16 million unknowns. Across 8 GPUs (on distinct nodes connected via MPI), our code achieves a speedup of at least 2 times compared to 32 CPU cores (across 4 distinct nodes connected via MPI). Compared with 64 CPU cores across 8 nodes, the same GPU version is comparable in wall-clock time.
Index Terms
- Multi-GPU/CPU deflated preconditioned conjugate gradient for bubbly flow solver