Accelerating Sparse Cholesky Factorization on Sunway Manycore Architecture.

Mingzhen Li, Yi Liu, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian
DOI: https://doi.org/10.1109/tpds.2019.2953852
IF: 5.3
2020-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract: To improve the performance of sparse Cholesky factorization, existing research groups adjacent columns of the sparse matrix that share the same nonzero pattern into supernodes for parallelization. However, because sparse matrices vary widely in structure, the computation of the generated supernodes varies significantly and is thus hard to optimize when computed by dense matrix kernels. Therefore, how to efficiently map sparse Cholesky factorization to emerging architectures, such as the Sunway many-core processor, remains an active research direction. In this article, we propose swCholesky, a highly optimized implementation of sparse Cholesky factorization on the Sunway processor. Specifically, we design three kernel task queues and a dense matrix library to dynamically adapt to the kernel characteristics and architecture features. In addition, we propose an auto-tuning mechanism to search for the optimal settings of the important parameters in swCholesky. Our experiments show that swCholesky achieves better performance than state-of-the-art implementations.
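The supernode grouping mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from swCholesky; the function name `find_supernodes` and the input layout (one set of row indices per column of the lower-triangular factor) are assumptions made for illustration. It uses the common textbook criterion that column j+1 joins the supernode of column j when the pattern of column j, with its diagonal entry removed, equals the pattern of column j+1.

```python
def find_supernodes(col_patterns):
    """Group adjacent columns with matching nonzero patterns into supernodes.

    col_patterns[j] is the set of row indices of the nonzeros in column j of
    the lower-triangular factor, including the diagonal entry j (assumed
    input layout, hypothetical). Returns a list of (first_col, last_col)
    ranges, one per supernode.
    """
    supernodes = []
    n = len(col_patterns)
    start = 0
    for j in range(1, n):
        # Column j extends the current supernode only if column j-1's
        # pattern below its diagonal is exactly column j's pattern.
        if col_patterns[j - 1] - {j - 1} != col_patterns[j]:
            supernodes.append((start, j - 1))
            start = j
    if n:
        supernodes.append((start, n - 1))
    return supernodes


# Columns 0-2 form one dense lower-triangular block, columns 3-4 another.
patterns = [{0, 1, 2}, {1, 2}, {2}, {3, 4}, {4}]
print(find_supernodes(patterns))
```

Each resulting range can be stored as a dense block and handed to dense matrix kernels. Because the block sizes depend on the sparsity structure, the per-supernode workload can differ widely, which is the imbalance the abstract describes.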