Abstract: In this paper, we propose an efficient parallel dynamic linear solver, called GPU-GMRES, for transient analysis of large linear dynamic systems such as large power grid networks. The new method is based on the preconditioned generalized minimum residual (GMRES) iterative method implemented on heterogeneous CPU–GPU platforms. The new solver is very robust and can be applied to power grids with different structures as well as to general analysis problems for large linear dynamic systems with asymmetric matrices. The proposed GPU-GMRES solver adopts the very general and robust incomplete LU based preconditioner. We show that by properly selecting the amount of fill-in in the incomplete LU factors, a good trade-off between GPU efficiency and convergence rate can be achieved for the best overall performance. This tunable feature makes the algorithm adaptable to different problems. The GPU-GMRES solver partitions the major computing tasks of GMRES so as to minimize data traffic between the CPU and GPUs, which improves the performance of the proposed method. Furthermore, we propose a new fast parallel sparse matrix–vector (SpMV) multiplication algorithm to further accelerate the GPU-GMRES solver. The new algorithm, called segSpMV, achieves fully coalesced memory access, in contrast to existing approaches. To improve scalability and efficiency, the segSpMV method is extended to multi-GPU platforms, which leads to a more scalable and faster multi-GPU GMRES solver. Experimental results on the published IBM benchmark circuits and mesh-structured power grid networks show that the GPU-GMRES solver can deliver orders-of-magnitude speedup over the direct LU solver, UMFPACK. The resulting multi-GPU-GMRES solver can also deliver a 3–12× speedup over the CPU implementation of the same GMRES method on transient analysis.
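For readers unfamiliar with the SpMV kernel that the abstract singles out, the following is a minimal sequential sketch of sparse matrix–vector multiplication in the common compressed sparse row (CSR) layout. It is not the segSpMV algorithm and not the authors' code: the function name `csr_spmv`, the CSR array names, and the small example matrix are all assumptions chosen for illustration. It only shows the operation that GMRES performs once per iteration and that a GPU kernel would parallelize across rows.

```python
def csr_spmv(indptr, indices, data, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR format.

    indptr[r]..indptr[r+1] delimits the nonzeros of row r;
    indices[k] is the column of the k-th nonzero and data[k] its value.
    (Hypothetical helper for illustration; not the paper's segSpMV.)
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Rows may hold different numbers of nonzeros, so the work
        # per row is irregular.
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


# Illustrative asymmetric 3x3 matrix (assumed, not from the paper):
# [[4, 1, 0],
#  [0, 3, 2],
#  [1, 0, 5]]
indptr = [0, 2, 4, 6]
indices = [0, 1, 1, 2, 0, 2]
data = [4.0, 1.0, 3.0, 2.0, 1.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(indptr, indices, data, x)
print(y)
```

In a straightforward GPU port that assigns one thread per row, neighboring threads read `data` and `indices` at widely separated offsets, which is one reason such kernels tend to have uncoalesced memory access; the abstract states that segSpMV is designed to avoid this.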

Higher Order Method of Moments With a Parallel Out-of-Core LU Solver on GPU/CPU Platform

An Efficient GPU-Based Out-of-Core LU Solver of Parallel Higher-Order Method of Moments for Solving Airborne Array Problems

Parallel Higher-Order Method of Moments with Efficient Out-of-GPU Memory Schemes for Solving Electromagnetic Problems

A Highly Efficient GPU-CPU Hybrid Parallel Implementation of Sparse LU Factorization

A New Hybrid GPU-CPU Sparse LDLT Factorization Algorithm with GPU and CPU Factorizing Concurrently

A New Decomposition Solver for Complex Electromagnetic Problems [EM Programmer's Notebook]

A Shared Memory-Based Parallel Out-of-core LU Solver for Matrix Equations with Application in EM Problems

Parallel Triangular Solvers on GPU

Computing Low-Rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and Its Application to Solving a Hierarchically Semiseparable Linear System of Equations

On Parallel Stiff ODEs Solver for Hybrid CPU-GPU Architecture

Parallel MoM-PO Method with Out-of-Core Technique for Analysis of Complex Arrays on Electrically Large Platforms

An Efficient Matrix Equation Parallel Direct Solver for Higher-Order Method of Moments in Solution of Complex Electromagnetic Problems

Sparse matrix LU decomposition method based on GPU

A Hybrid CPU-GPU Multifrontal Optimizing Method in Sparse Cholesky Factorization

Towards Optimal Fast Matrix Multiplication on CPU-GPU Platforms

Parallel GMRES solver for fast analysis of large linear dynamic systems on GPU platforms

Real-time 3-D image analysis via Jacobi moments

GPU Based Two-Level CMFD Accelerating Two-Dimensional MOC Neutron Transport Calculation

Parallel In-Core and Out-of-core Solution of Electrically Large Problems Using the RWG Basis Functions

Parallel singular value decomposition on heterogeneous multi-core and multi-GPU platforms