Refactoring The Molecular Docking Simulation For Heterogeneous, Manycore Processors Systems
Junshi Chen, Han Lin, Weihao Liang, Yang Yu, Wenting Han, Hong An, Yong Chen, Xin Liu
DOI: https://doi.org/10.1109/ISPA/IUCC.2017.00157
2017-01-01
Abstract: This paper presents a scalable design and implementation of the molecular docking application DOCK for a large-scale high-performance computing system, the Sunway TaihuLight supercomputer, whose heterogeneous, manycore processor architecture consists of management processing elements (MPEs) and clusters of computing processing elements (CPEs). The key innovation is a novel refactoring of DOCK on the CPEs. To improve the exploitation of the computational resources, we employ several optimization techniques: data redundancy minimization to fit data in cache, software-controlled prefetching into scratchpads, memory access coalescing, software caches, vectorization, and loop unrolling. For a single docking process, the refactored version using both the MPE and the CPE cluster achieves a 260x to 402x speedup over the original ported version using the MPE only. To scale DOCK to the full Sunway TaihuLight system with 10,649,600 cores (including all MPE and CPE cores), we also present an MPI communication domain partitioning scheme. For docking 9 million small compounds to a Zika virus target protein, we scale to 131,072 MPEs and 8,388,608 CPEs, a total of 8,519,680 cores.
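The core counts quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes the publicly documented SW26010 layout (each core group pairs 1 MPE with 64 CPEs, 4 core groups per processor, 40,960 processors in the full system). The helper names `core_counts` and `domain_of_rank`, and the idea of grouping ranks into fixed-size contiguous domains, are hypothetical; the abstract does not describe how the partitioning scheme is actually defined.

```python
# Assumption: SW26010 core group = 1 MPE + 64 CPEs (not stated in the abstract).
CPES_PER_MPE = 64
# Assumption: full system = 40,960 processors x 4 core groups each.
FULL_SYSTEM_MPES = 40_960 * 4


def core_counts(num_mpes):
    """Return (CPE count, total core count) for a given number of MPEs."""
    cpes = num_mpes * CPES_PER_MPE
    return cpes, num_mpes + cpes


def domain_of_rank(rank, domain_size):
    """Hypothetical partitioning: map a rank to a contiguous, fixed-size domain.

    This is a generic grouping rule used for illustration, not the paper's scheme.
    """
    return rank // domain_size


if __name__ == "__main__":
    cpes, total = core_counts(131_072)
    print(cpes, total)                       # CPEs and total cores in the reported run
    print(core_counts(FULL_SYSTEM_MPES)[1])  # total cores in the full system
    print(131_072 / FULL_SYSTEM_MPES)        # fraction of core groups used
```

Under these assumptions, the reported run occupies 131,072 of the 163,840 core groups in the full machine, which is consistent with the totals given in the abstract.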