Studying performance portability of LAMMPS across diverse GPU‐based platforms
Nick Hagerty, Verónica G. Melesse Vergara, Arnold Tharrington
DOI: https://doi.org/10.1002/cpe.7895
2023-09-25
Concurrency and Computation: Practice and Experience
Abstract: The molecular dynamics simulation software LAMMPS uses the Kokkos acceleration library to port computation to a diverse set of architectures, including those based on GPU accelerators. In addition to Kokkos, LAMMPS contains a vast code base that leverages the CUDA application programming interface through library functions such as cuFFT, CUDA's fast Fourier transform (FFT) library, and, more recently, rapidly growing support for AMD's Heterogeneous Interface for Portability (HIP). While preparing LAMMPS tests for the AMD GPU‐based test systems that preceded Frontier, we investigated several strategies for accelerating LAMMPS on AMD GPUs, using the AMD Instinct MI100 and MI250X. In this work, we integrated the HIP FFT library, hipFFT, into the particle‐particle particle‐mesh (PPPM) long‐range solver, which allowed PPPM calculations to be ported to the GPUs. We also investigated Kokkos behavior on the MI100 and MI250X through the package kokkos command of LAMMPS, targeting communication, memory usage, and particle grid decomposition. The Tersoff, Reax, Lennard‐Jones (LJ), EAM, Granular, and PPPM potentials were investigated in this effort, and results from these experiments are provided. For comparison, the selected potentials were run on Spock (AMD Instinct MI100), Crusher (AMD Instinct MI250X), AFW HPC11 (NVIDIA A100), and Summit (NVIDIA V100). Operational roofline models were constructed and analyzed for the Tersoff, Reax, and Lennard‐Jones potentials on Crusher and Summit.
computer science, theory & methods, software engineering