An Efficient Method for Optimizing PETSc on the Sunway TaihuLight System.

Letian Kang, Zhi-Jie Wang, Zhe Quan, Weigang Wu, Song Guo, Kenli Li, Keqin Li
DOI: https://doi.org/10.1109/smartworld.2018.00115
2018-01-01
Abstract: High-performance computing platforms bring great benefits for processing various ubiquitous computing tasks. The Sunway TaihuLight supercomputer is a novel high-performance computing platform, which is ranked No. 1 on the TOP500 list. In this paper, we focus on how to optimize the Portable, Extensible Toolkit for Scientific Computation (PETSc) running on supercomputers. The motivations for this study are twofold: (i) PETSc is widely and frequently used in many scientific research fields such as biology, fusion, artificial intelligence, and geosciences; and (ii) the current PETSc core does not fully utilize the potential of the Sunway TaihuLight system, especially its powerful SW26010 processor. To make PETSc highly efficient, the central idea of our optimizations is to improve the performance of time-consuming and frequently used computation components (e.g., the matrix and vector modules). To this end, we propose (i) accelerating kernel code with computing processing elements (CPEs), for which a new compression format and targeted optimizations for vector and matrix operations are devised; and (ii) using more efficient memory access schemes. We have implemented our proposals and evaluated their effectiveness and efficiency through a real-world application, Structural Finite Element Analysis (SFEA). We obtain a speedup of 16 to 32 times on a single SW26010 processor. As an additional finding, the results also show high scalability on over 8,000 computing nodes, i.e., 532,500 cores.