SW_GROMACS: accelerate GROMACS on Sunway TaihuLight

Tingjian Zhang, Yuxuan Li, Ping Gao, Qi Shao, Mingshan Shao, Meng Zhang, Jinxiao Zhang, Xiaohui Duan, Zhao Liu, Lin Gan, Haohuan Fu, Wei Xue, Weiguo Liu, Guangwen Yang
DOI: https://doi.org/10.1145/3295500.3356190
2019-01-01
Abstract: GROMACS is one of the most popular Molecular Dynamics (MD) applications and is widely used in the study of chemical and biomolecular systems. Like other MD applications, it requires long run times for large-scale simulations. Therefore, many high-performance platforms have been employed to accelerate it, such as Knights Landing (KNL), the Cell processor, and Graphics Processing Units (GPUs). As the third fastest supercomputer in the world, Sunway TaihuLight contains 40960 SW26010 processors, each a typical many-core processor. To make full use of the superior computational ability of TaihuLight, we port GROMACS to SW26010 with the following new strategies: (1) a new deferred update strategy; (2) a new update mark strategy; (3) full pipeline acceleration. Furthermore, we redesign GROMACS to enable all possible vectorization. Experiments show that our implementation achieves better performance than both Intel KNL and Nvidia P100 GPU when using an appropriate number of SW26010 processors for a fair comparison.