Hardware acceleration of tensor-structured multilevel Ewald summation method on MDGRAPE-4A, a special-purpose computer system for molecular dynamics simulations

Gentaro Morimoto, Yohei M. Koyama, Hao Zhang, Teruhisa S. Komatsu, Yousuke Ohno, Keigo Nishida, Itta Ohmura, Hiroshi Koyama, Makoto Taiji
DOI: https://doi.org/10.1145/3458817.3476190
2021-11-13
Abstract: We developed MDGRAPE-4A, a special-purpose computer system for molecular dynamics simulations, consisting of 512 nodes of custom system-on-a-chip LSIs with dedicated processor cores and interconnects designed to achieve strong scalability for biomolecular simulations. To reduce the global communications required for the evaluation of Coulomb interactions, we co-designed MDGRAPE-4A and a novel algorithm, the tensor-structured multilevel Ewald summation method (TME), which produced hardware modules on the custom LSI for particle-grid operations and for grid-grid separable convolutions on a 3D torus network. We implemented the convolution for the top-level grid potentials by using 3D FFTs on an FPGA, along with an FPGA-based octree network to gather grid charges. The elapsed time for the long-range part of the Coulomb interaction is 50 $\mu\mathrm{s}$, most of which can overlap with that of the short-range part; the additional cost is approximately 10 $\mu\mathrm{s}/\text{step}$, which is only a 5% performance loss.
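The "grid-grid separable convolutions" mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the kernel or the grid size, so everything concrete here is an assumption for illustration: a 4x4x4 periodic grid, a three-tap stencil `taps`, and the helper names. The point is only that when a 3D kernel factorizes as a product of 1D kernels, three successive 1D passes with periodic wrap-around (the access pattern of a torus) give the same result as the full 3D convolution.

```python
import itertools
import random


def make_grid(n):
    """Allocate an n[0] x n[1] x n[2] nested-list grid of zeros."""
    return [[[0.0] * n[2] for _ in range(n[1])] for _ in range(n[0])]


def shape(grid):
    return (len(grid), len(grid[0]), len(grid[0][0]))


def conv1d_periodic(grid, taps, axis):
    """Circular 1D convolution along one axis; taps maps offset -> weight."""
    n = shape(grid)
    out = make_grid(n)
    for idx in itertools.product(range(n[0]), range(n[1]), range(n[2])):
        total = 0.0
        for offset, weight in taps.items():
            src = list(idx)
            src[axis] = (idx[axis] - offset) % n[axis]  # periodic wrap
            total += weight * grid[src[0]][src[1]][src[2]]
        out[idx[0]][idx[1]][idx[2]] = total
    return out


def conv3d_separable(grid, taps_x, taps_y, taps_z):
    """Three 1D passes, one per axis."""
    step = conv1d_periodic(grid, taps_x, 0)
    step = conv1d_periodic(step, taps_y, 1)
    return conv1d_periodic(step, taps_z, 2)


def conv3d_direct(grid, taps_x, taps_y, taps_z):
    """Reference: full 3D convolution with the product kernel wx*wy*wz."""
    n = shape(grid)
    out = make_grid(n)
    for i, j, k in itertools.product(range(n[0]), range(n[1]), range(n[2])):
        total = 0.0
        for (dx, wx), (dy, wy), (dz, wz) in itertools.product(
            taps_x.items(), taps_y.items(), taps_z.items()
        ):
            total += wx * wy * wz * grid[(i - dx) % n[0]][(j - dy) % n[1]][(k - dz) % n[2]]
        out[i][j][k] = total
    return out


def max_abs_diff(a, b):
    n = shape(a)
    return max(
        abs(a[i][j][k] - b[i][j][k])
        for i, j, k in itertools.product(range(n[0]), range(n[1]), range(n[2]))
    )


random.seed(0)
N = 4  # illustrative grid size, not taken from the paper
charges = [[[random.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)] for _ in range(N)]
taps = {-1: 0.25, 0: 0.5, 1: 0.25}  # hypothetical 1D stencil

separable = conv3d_separable(charges, taps, taps, taps)
direct = conv3d_direct(charges, taps, taps, taps)
print("separable matches direct:", max_abs_diff(separable, direct) < 1e-12)
```

For a stencil with m taps per axis, the direct form costs m^3 multiply-adds per grid point and the separable form costs 3m, and each 1D pass only needs data from neighbors along a single axis.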