PARALLEL SOLUTION OF FINE-MESH NEUTRON DIFFUSION EQUATION ON HETEROGENEOUS STRUCTURE OF SUNWAY TAIHULIGHT SUPERCOMPUTER

Feiyu Chen, Ganglin Yu, Shifei Shen
DOI: https://doi.org/10.1299/jsmeicone.2019.27.2011
2019-01-01
Abstract: Fast and accurate solution of the multi-group neutron diffusion equation is very important in numerical reactor physics calculations. Among the numerical methods, the finite difference method is simple, easy to program, and gives accurate results. However, it requires a fine mesh, which leads to long computing times. Parallel computing on platforms such as supercomputers is an effective way to reduce the computing time. Advanced supercomputers like Sunway TaihuLight have heterogeneous structures, and similar structures are expected to be widely used by future exascale (E-level) supercomputers. However, parallel programming on heterogeneous structures is more complex, and there is currently little software research on reactor physics for Sunway TaihuLight. In this paper, a parallel program was developed on Sunway TaihuLight to solve the fine-mesh neutron diffusion equation with the finite difference method. The program uses a two-level parallel mode: process-level parallelization between core groups through the Message Passing Interface (MPI), and thread-level parallelization within each core group. For the thread-level parallelization, two parallel programming interfaces designed specifically for Sunway TaihuLight were each tried for comparison: OpenACC*, a high-level method, and Athread, a low-level method. The two-group IAEA static 3-D PWR benchmark problem was used to verify the program's results and to test its parallel performance relative to the serial performance of the Sunway processor. The calculated results agree with the reference results. Athread is shown to perform better than OpenACC* for the complicated iterations of the finite difference method. For a mesh of 170×170×380, the speedup ratio is 12.907 on a single core group with Athread. For the process-level parallelization, the program's speedup ratio currently reaches 201.322 on 64 core groups. The efficiency of the Computing Processing Elements (CPEs) is found to decrease as the number of CPEs increases. The program is shown to be highly parallelizable, with stable performance when higher accuracy is required.
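To make the reported speedup ratios easier to interpret, the following minimal sketch converts them into parallel efficiency, taken here as speedup divided by the number of parallel workers. It rests on assumptions not stated in the abstract: that the single-core-group Athread run used all 64 CPEs of a core group, and that both speedups are measured against the same serial baseline. The function name `parallel_efficiency` is hypothetical and is not part of the authors' program.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return speedup divided by the number of parallel workers."""
    if workers <= 0:
        raise ValueError("workers must be a positive integer")
    return speedup / workers


# Assumption: the Athread run on one core group used all 64 of its CPEs.
CPES_PER_CORE_GROUP = 64
thread_level = parallel_efficiency(12.907, CPES_PER_CORE_GROUP)

# Speedup reported on 64 core groups, expressed per core group.
# Assumption: it is measured against the same serial baseline.
CORE_GROUPS = 64
per_core_group = parallel_efficiency(201.322, CORE_GROUPS)

print(f"thread-level efficiency: {thread_level:.4f}")
print(f"speedup per core group:  {per_core_group:.4f}")
```

Under these assumptions, a thread-level efficiency of roughly 0.20 would be consistent with the abstract's observation that CPE efficiency decreases as more CPEs are used.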