Breaking the Molecular Dynamics Timescale Barrier Using a Wafer-Scale System

Kylee Santos, Stan Moore, Tomas Oppelstrup, Amirali Sharifian, Ilya Sharapov, Aidan Thompson, Delyan Z Kalchev, Danny Perez, Robert Schreiber, Scott Pakin, Edgar A Leon, James H Laros III, Michael James, Sivasankaran Rajamanickam
2024-05-14
Abstract: Molecular dynamics (MD) simulations have transformed our understanding of the nanoscale, driving breakthroughs in materials science, computational chemistry, and several other fields, including biophysics and drug design. Even on exascale supercomputers, however, runtimes are excessive for systems and timescales of scientific interest. Here, we demonstrate strong scaling of MD simulations on the Cerebras Wafer-Scale Engine. By dedicating a processor core for each simulated atom, we demonstrate a 179-fold improvement in timesteps per second versus the Frontier GPU-based Exascale platform, along with a large improvement in timesteps per unit energy. Reducing every year of runtime to two days unlocks currently inaccessible timescales of slow microstructure transformation processes that are critical for understanding material behavior and function. Our dataflow algorithm runs Embedded Atom Method (EAM) simulations at rates over 270,000 timesteps per second for problems with up to 800k atoms. This demonstrated performance is unprecedented for general-purpose processing cores.
Computational Physics; Distributed, Parallel, and Cluster Computing; Emerging Technologies
What problem does this paper attempt to address?
This paper addresses the timescale limitation of molecular dynamics (MD) simulations. On current distributed-memory parallel systems, weak scaling makes larger systems tractable but does not extend the timescale that can be simulated. The limit arises mainly from the mismatch between compute speed and inter-node communication bandwidth and latency. MD simulations are crucial in fields such as materials science, chemistry, and physics, but they need femtosecond timesteps to resolve atomic vibrations, while important physical phenomena such as material aging and catalytic reactions play out on microsecond timescales.
To break this timescale barrier, the paper proposes using a highly parallel system in which communication bandwidth matches compute throughput and communication latency is on the scale of the core clock cycle. The Cerebras Wafer-Scale Engine (WSE) is a single silicon wafer with nearly a million processor cores and a high-performance communication and memory system, which is expected to support efficient strong scaling and significantly increase the number of timesteps per second. By implementing strong scaling for MD on the WSE, the paper demonstrates speedups on physically relevant systems and significantly extends the timescales reachable by direct MD simulation. The specific contributions are:
1. Efficient strong scaling of MD simulations on the Cerebras Wafer-Scale Engine, increasing the accessible MD timescale by more than 100 times.
2. A simple but accurate performance model whose predictions are within 3% of the measured performance of the implementation.
3. Simulations of an 800,000-atom metal crystal lattice at 274,000 timesteps per second, 179 times faster than state-of-the-art GPU-based platforms.
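To make the gap between femtosecond timesteps and microsecond phenomena concrete, the following is a back-of-the-envelope sketch, not a calculation taken from the paper. The 274,000 timesteps per second and the 179-fold speedup come from the text; the 1 fs timestep is an assumption chosen for illustration, and the GPU baseline rate is simply inferred by dividing the WSE rate by the stated speedup.

```python
# Rough wall-clock estimate for reaching one microsecond of simulated time.
# Figures from the text: 274,000 timesteps/s on the WSE and a 179x speedup.
# Assumption (not from the source): a 1 fs timestep.

TIMESTEP_S = 1e-15        # assumed timestep: 1 femtosecond
TARGET_S = 1e-6           # simulated time to reach: 1 microsecond
WSE_STEPS_PER_S = 274_000 # rate reported for the 800,000-atom lattice
SPEEDUP = 179             # reported speedup over the GPU-based baseline


def steps_needed(target_time_s: float, timestep_s: float) -> int:
    """Number of timesteps required to cover the target simulated time."""
    return round(target_time_s / timestep_s)


def wall_clock_seconds(n_steps: int, steps_per_second: float) -> float:
    """Wall-clock time to run n_steps at a fixed timestep rate."""
    return n_steps / steps_per_second


n_steps = steps_needed(TARGET_S, TIMESTEP_S)
wse_hours = wall_clock_seconds(n_steps, WSE_STEPS_PER_S) / 3600
# Baseline rate inferred from the stated speedup, not measured independently.
gpu_days = wall_clock_seconds(n_steps, WSE_STEPS_PER_S / SPEEDUP) / 86_400
# The same speedup applied to one year of baseline runtime.
year_in_days = 365 / SPEEDUP

print(f"timesteps needed: {n_steps:.1e}")
print(f"WSE wall clock: {wse_hours:.2f} hours")
print(f"inferred GPU baseline wall clock: {gpu_days:.2f} days")
print(f"one year of baseline runtime becomes {year_in_days:.2f} days")
```

Under these assumptions, a microsecond of simulated time takes about a billion timesteps, which is roughly an hour at the reported WSE rate and about a week at the inferred baseline rate.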
The paper also reviews the current state of MD simulations and the performance bottlenecks on GPU and CPU platforms, and it proposes a locality-preserving mapping of atoms to WSE cores that optimizes data exchange and computational efficiency. It further describes the algorithm and the technical details of the implementation, including neighbor lists, atom exchange, and the handling of periodic boundary conditions. In summary, the goal of the paper is to greatly extend the timescales over which material behavior and function can be studied with MD, so that a simulation that would previously have needed about a year of runtime can finish in about two days.
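Two of the ideas mentioned above, periodic boundary conditions and a locality-preserving assignment of one atom per core, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `minimum_image` and `map_atoms_to_cores` are hypothetical, the box is assumed to be orthorhombic, and the sort-based mapping is only one simple way to keep spatial neighbors on nearby cores of a 2D grid.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def minimum_image(a: Vec3, b: Vec3, box: Vec3) -> Vec3:
    """Displacement from a to b using the nearest periodic image of b.

    Assumes an orthorhombic box with edge lengths given by `box`.
    """
    out = []
    for ai, bi, li in zip(a, b, box):
        d = bi - ai
        d -= li * round(d / li)  # wrap into roughly [-li/2, li/2]
        out.append(d)
    return (out[0], out[1], out[2])


def map_atoms_to_cores(
    positions: List[Vec3], n_cols: int, n_rows: int
) -> Dict[int, Tuple[int, int]]:
    """Assign each atom index to a distinct (col, row) core.

    Illustrative heuristic: sort atoms by x and cut them into columns,
    then sort each column by y. Atoms close in x and y end up on nearby
    cores; z is ignored in this simplified projection.
    """
    if len(positions) > n_cols * n_rows:
        raise ValueError("more atoms than cores")
    by_x = sorted(range(len(positions)), key=lambda i: positions[i][0])
    mapping: Dict[int, Tuple[int, int]] = {}
    for col in range(n_cols):
        column = by_x[col * n_rows:(col + 1) * n_rows]
        column.sort(key=lambda i: positions[i][1])
        for row, atom in enumerate(column):
            mapping[atom] = (col, row)
    return mapping


# Usage: two atoms on opposite faces of a 10-unit box are 1 unit apart.
disp = minimum_image((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), (10.0, 10.0, 10.0))

# Usage: four atoms on a square land on the matching 2x2 core grid.
atoms = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
cores = map_atoms_to_cores(atoms, n_cols=2, n_rows=2)
```

In a real simulation atoms move, so any such assignment has to be refreshed over time, which is the role the text attributes to atom exchange.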