Abstract: Graphics processing units have been extensively used to accelerate classical molecular dynamics simulations. However, there is much less progress on the acceleration of force evaluations for many-body potentials compared to pairwise ones. In the conventional force evaluation algorithm for many-body potentials, the force, virial stress, and heat current for a given atom are accumulated within different loops, which could result in write conflict between different threads in a CUDA kernel. In this work, we provide a new force evaluation algorithm, which is based on an explicit pairwise force expression for many-body potentials derived recently [Phys. Rev. B 92 (2015) 094301]. In our algorithm, the force, virial stress, and heat current for a given atom can be accumulated within a single thread and is free of write conflicts. We discuss the formulations and algorithms and evaluate their performance. A new open-source code, GPUMD, is developed based on the proposed formulations. For the Tersoff many-body potential, the double precision performance of GPUMD using a Tesla K40 card is equivalent to that of the LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) molecular dynamics code running with about 100 CPU cores (Intel Xeon CPU X5670 @ 2.93 GHz).
Computational Physics; Materials Science; Computational Engineering, Finance, and Science
What problem does this paper attempt to address?
This paper addresses how to run classical molecular dynamics simulations with many-body potentials efficiently on graphics processing units (GPUs). Its focus is the force evaluation step, where GPU acceleration has progressed much less for many-body potentials than for pairwise ones.
### Main Issues
1. **Write Conflicts**: In the conventional force evaluation algorithm for many-body potentials, the force, virial stress, and heat current of a given atom are accumulated within different loops, which can lead to write conflicts between different threads in a CUDA kernel.
2. **Performance Bottlenecks**: GPU acceleration of force evaluation has advanced much less for many-body potentials, such as the Tersoff and Stillinger-Weber potentials, than for pairwise ones.
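
The write-conflict issue can be made concrete with a small sketch. It assumes a scatter-style three-body loop in which the "thread" handling a centre atom `i` also updates its neighbours `j` and `k`; this is an illustrative guess at the conventional pattern, not code from the paper, and the name `scatter_write_sets` is hypothetical. Plain Python loops stand in for CUDA threads.

```python
from collections import defaultdict


def scatter_write_sets(neighbors):
    """Map each output slot to the set of "threads" that write to it.

    One thread per centre atom i (an assumption for illustration).
    In a scatter-style three-body loop, thread i updates the force
    slots of i, j, and k for every neighbour pair (j, k) of atom i.
    """
    writers = defaultdict(set)
    for i, neigh in enumerate(neighbors):  # "thread" i
        for j in neigh:
            writers[i].add(i)  # contribution to atom i itself
            writers[j].add(i)  # contribution scattered to neighbour j
            for k in neigh:
                if k != j:
                    writers[k].add(i)  # contribution scattered to neighbour k
    return writers


# Three mutually neighbouring atoms (a triangle).
writers = scatter_write_sets([[1, 2], [0, 2], [0, 1]])
# Slots written by more than one thread would need atomics or extra passes.
conflicts = {slot: sorted(t) for slot, t in writers.items() if len(t) > 1}
print(conflicts)
```

Any slot that appears in `conflicts` is written by several threads, which on a GPU would require atomic operations or a restructured algorithm.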
### Solutions
The paper proposes a new force evaluation algorithm based on an explicit pairwise force expression for many-body potentials derived recently [Phys. Rev. B 92 (2015) 094301]. The features of this new algorithm include:
1. **Single-thread Accumulation**: The force, virial stress, and heat current of each atom can be accumulated within a single thread, avoiding write conflicts.
2. **Explicit Atomic Expressions**: The force, virial stress, and heat current have explicit per-atom expressions, making the algorithm simple, flexible, and efficient to implement on the GPU.
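
A minimal sketch of single-thread accumulation follows. It assumes the pairwise form F_ij = ∂U_i/∂r_ij − ∂U_j/∂r_ji with r_ij = r_j − r_i, which is our reading of the cited derivation and should be checked against that reference. The site energy U_i = s_i², with s_i the sum of squared neighbour distances, is a hypothetical toy many-body potential chosen only because its derivative is easy to write down; it is not the Tersoff potential. All function names are illustrative, the neighbour lists are assumed symmetric, and only the force (not the virial stress or heat current) is covered.

```python
def site_sums(pos, neighbors):
    """s_i = sum over neighbours j of |r_ij|^2, with r_ij = r_j - r_i."""
    return [
        sum(sum((pos[j][a] - pos[i][a]) ** 2 for a in range(3)) for j in neighbors[i])
        for i in range(len(pos))
    ]


def total_energy(pos, neighbors):
    """Toy many-body energy U = sum_i s_i^2 (hypothetical, for illustration)."""
    return sum(s * s for s in site_sums(pos, neighbors))


def dU_dr(s_i, r_ij):
    """Partial derivative dU_i/dr_ij = 4 * s_i * r_ij for the toy potential."""
    return [4.0 * s_i * x for x in r_ij]


def gather_forces(pos, neighbors):
    """Accumulate F_i = sum_j (dU_i/dr_ij - dU_j/dr_ji), one "thread" per atom."""
    s = site_sums(pos, neighbors)
    forces = []
    for i in range(len(pos)):  # "thread" i
        f_i = [0.0, 0.0, 0.0]  # thread-local accumulator
        for j in neighbors[i]:
            r_ij = [pos[j][a] - pos[i][a] for a in range(3)]
            r_ji = [-x for x in r_ij]
            p_ij = dU_dr(s[i], r_ij)
            p_ji = dU_dr(s[j], r_ji)
            for a in range(3):
                f_i[a] += p_ij[a] - p_ji[a]
        forces.append(f_i)  # the only write: slot i
    return forces
```

Because each iteration of the outer loop writes only to its own slot `i`, no two threads ever touch the same output element. The cost is that the derivative for each pair is evaluated from both sides instead of once.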
### Specific Contributions
1. **Algorithm Design**: A new force computation algorithm is proposed, and its formulas and implementation methods are discussed in detail.
2. **Performance Evaluation**: An open-source code, GPUMD, was developed and its performance evaluated. For the Tersoff many-body potential, the double-precision performance of GPUMD on a Tesla K40 card is equivalent to that of LAMMPS running on about 100 CPU cores (Intel Xeon CPU X5670 @ 2.93 GHz).
3. **Generality**: The algorithm is not only applicable to the Tersoff potential but can also be extended to other many-body potential functions.
### Conclusion
The paper proposes a force evaluation algorithm for running classical molecular dynamics simulations with many-body potentials efficiently on the GPU. By accumulating each atom's force, virial stress, and heat current within a single thread, the algorithm avoids write conflicts. For the Tersoff potential, the GPUMD implementation on one Tesla K40 card matches the double-precision performance of LAMMPS on about 100 CPU cores.