swPTS: an efficient parallel Thomas split algorithm for tridiagonal systems on Sunway manycore processors
Min Tian, Qi Liu, Jingshan Pan, Ying Gou, Zanjun Zhang
DOI: https://doi.org/10.1007/s11227-023-05641-1
IF: 3.3
2023-09-19
The Journal of Supercomputing
Abstract: The tridiagonal system solver is a basic kernel and is well supported in mainstream numerical libraries. The purpose of this paper is to devise an efficient parallel algorithm to solve a large-scale tridiagonal system. Based on the performance analysis of the classic Thomas algorithm and the matrix splitting method, we propose a parallel Thomas split (PTS) algorithm. Compared with the matrix splitting method, the PTS algorithm can achieve an acceleration of 10.34×. Furthermore, we propose a Sunway parallel Thomas split (swPTS) algorithm based on the sw26010pro manycore processor. In the swPTS algorithm, we propose a specific data partitioning scheme to implement MPI+Athread parallelism. For the reduced set of equations, a new reduction approach for the Sunway architecture is proposed. Experiments show that the parallel elimination stage of our swPTS algorithm achieves up to 38.31× speedup over a PTS algorithm, and overall reaches 5.74× speedup over a Thomas algorithm.
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture
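
For orientation, the classic Thomas algorithm that the abstract uses as its serial baseline can be sketched as follows. This is a minimal illustration of the textbook method only, not the PTS or swPTS algorithm from the paper; the function name `thomas_solve` and the array layout (sub-diagonal `a`, main diagonal `b`, super-diagonal `c`, right-hand side `d`) are assumptions made for this example, and no pivoting is performed, so it presumes a well-conditioned (for instance diagonally dominant) system.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the classic serial Thomas algorithm.

    Hypothetical layout for this sketch (all lists have length n):
      a: sub-diagonal,   a[0] is unused
      b: main diagonal
      c: super-diagonal, c[n-1] is unused
      d: right-hand side
    No pivoting is done, so zero pivots are not handled.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: recover the unknowns from last to first.
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Usage: the 4x4 system with diagonals (-1, 2, -1) and solution [1, 2, 3, 4].
solution = thomas_solve(
    [0.0, -1.0, -1.0, -1.0],
    [2.0, 2.0, 2.0, 2.0],
    [-1.0, -1.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 5.0],
)
```

Both loops carry a dependency from one row to the next, which is why the serial method does not parallelize directly and why splitting-based approaches such as the one described in the abstract partition the system first.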