Solving unsymmetric sparse systems of linear equations with PARDISO

Olaf Schenk, Klaus Gärtner
DOI: https://doi.org/10.1016/j.future.2003.07.011
IF: 7.307
2004-04-01
Future Generation Computer Systems
Abstract: Supernode partitioning for unsymmetric matrices together with complete block diagonal supernode pivoting and asynchronous computation can achieve high gigaflop rates for parallel sparse LU factorization on shared memory parallel computers. The progress in weighted graph matching algorithms helps to extend these concepts further and unsymmetric prepermutation of rows is used to place large matrix entries on the diagonal. Complete block diagonal supernode pivoting allows dynamical interchanges of columns and rows during the factorization process. The level-3 BLAS efficiency is retained and an advanced two-level left–right looking scheduling scheme results in good speedup on SMP machines. These algorithms have been integrated into the recent unsymmetric version of the PARDISO solver. Experiments demonstrate that a wide set of unsymmetric linear systems can be solved and high performance is consistently achieved for large sparse unsymmetric matrices from real world applications.
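To make the row prepermutation step concrete, here is a minimal sketch that picks the row order maximizing the product of the absolute diagonal entries. It is an illustration only: it tries every permutation by brute force, which is feasible just for tiny dense matrices, whereas the paper relies on weighted graph matching algorithms for sparse matrices. The function names `best_row_permutation` and `permute_rows` and the example matrix are assumptions made for this sketch, not part of PARDISO.

```python
from itertools import permutations
from math import prod


def best_row_permutation(a):
    """Return (perm, score) where perm maximizes prod(|a[perm[i]][i]|).

    Brute force over all n! row orders (illustrative stand-in for a
    weighted matching algorithm; only usable for very small n).
    """
    n = len(a)
    best_perm, best_score = None, -1.0
    for perm in permutations(range(n)):
        # perm[i] is the original row that is moved to position i.
        score = prod(abs(a[perm[i]][i]) for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score


def permute_rows(a, perm):
    """Return a copy of `a` with its rows reordered according to perm."""
    return [list(a[r]) for r in perm]


# Hypothetical 3x3 example with a zero and two small diagonal entries.
A = [
    [0.0, 2.0, 0.0],
    [0.0, 0.1, 5.0],
    [4.0, 0.0, 0.3],
]
perm, score = best_row_permutation(A)
B = permute_rows(A, perm)
diagonal = [B[i][i] for i in range(len(B))]
```

After the permutation, the diagonal of `B` holds the largest entry of each column in this example, which is the situation the prepermutation is meant to create before factorization starts.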
What problem does this paper attempt to address?
The paper aims to make the solution of unsymmetric sparse linear systems more efficient and more robust, in particular by achieving efficient parallel sparse LU factorization on shared-memory multiprocessor architectures. It focuses on the following aspects:

1. **Efficient parallel solution of unsymmetric sparse systems**: The authors propose a method for parallel LU factorization of unsymmetric sparse matrices that improves computational performance and avoids the changes to the dependency structure that partial pivoting introduces during factorization.
2. **Improved scalability and robustness**: To scale well and remain robust on shared-memory multiprocessors, the paper replaces partial pivoting with complete block diagonal supernode pivoting, in which row and column interchanges are confined to the diagonal block of a supernode.
3. **Static computation of the task dependency graph**: Combining an unsymmetric row prepermutation with complete block diagonal supernode pivoting allows the task dependency graph to be computed before the numerical factorization, which simplifies synchronization in the parallel phase.
4. **High-performance implementation**: The factorization retains level-3 BLAS efficiency and uses a two-level left–right looking scheduling scheme to achieve good speedup.

In summary, the core objective of the paper is a parallel direct solver that handles large unsymmetric sparse linear systems efficiently and reliably, with experiments on matrices from real-world applications used to demonstrate its performance.
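The idea of restricting pivoting to the diagonal block of a supernode can be illustrated with a small dense sketch. It assumes that a list of `(start, stop)` index ranges stands in for the supernode partition, and it simply raises an error when a block contains no nonzero pivot; how the actual solver treats that case is not covered here. The names `lu_block_pivoting` and `reconstruct` are hypothetical, and the code is neither sparse nor BLAS-based, so it shows only the pivoting rule, not PARDISO's implementation.

```python
def lu_block_pivoting(a, blocks):
    """Dense LU with complete pivoting confined to each diagonal block.

    Returns (lu, row_perm, col_perm) such that
    a[row_perm[i]][col_perm[j]] == (L @ U)[i][j], where L (unit lower)
    and U (upper) are packed together in `lu`.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]
    row_perm = list(range(n))
    col_perm = list(range(n))
    for start, stop in blocks:
        for k in range(start, stop):
            # Search for the largest entry, but only inside the
            # remaining part of the current diagonal block.
            p, q = max(
                ((i, j) for i in range(k, stop) for j in range(k, stop)),
                key=lambda ij: abs(lu[ij[0]][ij[1]]),
            )
            if lu[p][q] == 0.0:
                # Assumption of this sketch: give up instead of repairing.
                raise ZeroDivisionError("no nonzero pivot inside the block")
            # Row interchange (stays inside the block).
            lu[k], lu[p] = lu[p], lu[k]
            row_perm[k], row_perm[p] = row_perm[p], row_perm[k]
            # Column interchange (stays inside the block).
            for row in lu:
                row[k], row[q] = row[q], row[k]
            col_perm[k], col_perm[q] = col_perm[q], col_perm[k]
            # Standard elimination step on the trailing submatrix.
            for i in range(k + 1, n):
                lu[i][k] /= lu[k][k]
                for j in range(k + 1, n):
                    lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, row_perm, col_perm


def reconstruct(lu):
    """Multiply the packed factors back together (L has a unit diagonal)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                out[i][j] += l_ik * lu[k][j]
    return out


# Hypothetical 4x4 example with two 2x2 "supernodes".
A = [
    [0.0, 3.0, 1.0, 0.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 0.5, 4.0],
    [0.0, 1.0, 2.0, 1.0],
]
blocks = [(0, 2), (2, 4)]
lu, row_perm, col_perm = lu_block_pivoting(A, blocks)
product = reconstruct(lu)
```

Because every interchange stays within one `(start, stop)` range, the block partition is the same before and after factorization. In this toy setting that is what lets the dependencies between blocks be fixed ahead of the numerical work.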