Abstract: The next generation of force fields for molecular dynamics will be developed using a wealth of data. Training systematically with experimental data remains a challenge, however, especially for machine learning potentials. Differentiable molecular simulation calculates gradients of observables with respect to parameters through molecular dynamics trajectories. Here we improve this approach by explicitly calculating gradients using a reverse-time simulation with effectively constant memory cost. The method is applied to learn all-atom water and gas diffusion models with different functional forms, and to train a machine learning potential for diamond from scratch. Comparison to ensemble reweighting indicates that reversible simulation can provide more accurate gradients and train to match time-dependent observables.
What problem does this paper attempt to address?
The paper addresses how to train classical force fields and machine learning potentials for molecular dynamics against experimental data, particularly when the target observables are time-dependent. It proposes reversible molecular simulation to overcome the memory consumption, computational cost, and exploding gradients that limit existing differentiable simulation methods.
### Main problems:
1. **Memory consumption**: Reverse-mode automatic differentiation (AD) stores intermediate states, so its memory use grows linearly with the number of simulation steps. This limits long simulations and large neural networks.
2. **Computational efficiency**: Reverse-mode automatic differentiation runs significantly slower than a standard simulation because of its additional computational overhead.
3. **Gradient explosion**: Gradients propagated through the numerical integration are prone to exploding, especially in long simulations.
4. **Time-dependent observables**: Traditional ensemble reweighting methods are not applicable to time-dependent observables such as diffusion coefficients and autocorrelation functions.
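
To make the notion of a time-dependent observable concrete, the following is a minimal sketch of estimating a diffusion coefficient from the mean squared displacement (MSD) using the Einstein relation, MSD(t) ≈ 2·d·D·t in d dimensions. The function names and the plain-list trajectory layout are assumptions made for illustration and are not taken from the paper. Because the estimate depends on how positions evolve over time, rather than on an average over independent configurations, it cannot be obtained by reweighting a static ensemble.

```python
def mean_squared_displacement(trajectory):
    """MSD of every frame relative to the first frame.

    `trajectory` is a list of frames; each frame is a list of (x, y, z)
    tuples, one per particle (a hypothetical layout for this sketch).
    """
    reference = trajectory[0]
    msd = []
    for frame in trajectory:
        total = 0.0
        for (x, y, z), (x0, y0, z0) in zip(frame, reference):
            total += (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
        msd.append(total / len(frame))
    return msd


def diffusion_coefficient(times, msd, n_dims=3):
    """Einstein relation: D = slope of MSD(t) divided by (2 * n_dims).

    The slope comes from an ordinary least-squares line fit.
    """
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den / (2 * n_dims)
```

In practice the fit would be restricted to the linear (diffusive) regime of the MSD curve and averaged over several time origins; this sketch omits both for brevity.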
### Solutions:
The paper proposes reversible molecular simulation. Instead of relying on general-purpose automatic differentiation, it calculates gradients explicitly during a reverse-time simulation, which brings the following improvements:
- **Memory-efficient**: Memory cost stays effectively constant, because only a small number of coordinate and velocity snapshots need to be stored.
- **Computationally efficient**: The computational cost is comparable to that of a standard simulation, avoiding the additional overhead of automatic differentiation.
- **Gradient stability**: Gradient truncation is used to avoid exploding gradients, improving gradient accuracy.
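
The core idea can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the paper's implementation: a single one-dimensional particle of unit mass in a harmonic potential with force constant `k`, integrated with velocity Verlet, and a loss equal to the final position. The backward pass reconstructs each earlier state by running the integrator in reverse while accumulating the gradient with respect to `k` using hand-derived chain-rule terms, so no trajectory is stored. All function names are hypothetical.

```python
def force(x, k):
    # Harmonic restoring force for unit mass: F = -k * x
    return -k * x


def forward(x, v, k, dt, n_steps):
    """Velocity Verlet; only the current state is kept."""
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * force(x, k)
        x = x + dt * v_half
        v = v_half + 0.5 * dt * force(x, k)
    return x, v


def reverse_gradient(x, v, k, dt, n_steps):
    """Gradient of the final position with respect to k.

    Starts from the final state, steps backwards in time to rebuild
    each earlier state, and accumulates the adjoint along the way.
    Returns (dL/dk, recovered x0, recovered v0).
    """
    gx, gv, gk = 1.0, 0.0, 0.0  # loss L = x_final, so dL/dx = 1, dL/dv = 0
    for _ in range(n_steps):
        # Undo one velocity Verlet step to recover the previous state.
        v_half = v - 0.5 * dt * force(x, k)
        x_prev = x - dt * v_half
        v_prev = v_half - 0.5 * dt * force(x_prev, k)
        # Adjoint of the second half-kick: v = v_half + 0.5*dt*(-k*x)
        gk += gv * 0.5 * dt * (-x)
        gx += gv * 0.5 * dt * (-k)
        # Adjoint of the drift: x = x_prev + dt * v_half
        g_vhalf = gv + gx * dt
        # Adjoint of the first half-kick: v_half = v_prev + 0.5*dt*(-k*x_prev)
        gk += g_vhalf * 0.5 * dt * (-x_prev)
        gx += g_vhalf * 0.5 * dt * (-k)
        gv = g_vhalf
        x, v = x_prev, v_prev
    return gk, x, v


x_final, v_final = forward(1.0, 0.0, 1.0, 0.01, 1000)
grad_k, x_start, v_start = reverse_gradient(x_final, v_final, 1.0, 0.01, 1000)
```

For chaotic many-body systems, rounding error makes the exact reversal drift, which is consistent with the summary's mention of stored snapshots and gradient truncation; neither is modelled in this toy sketch.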
### Application examples:
1. **Learning of water models**: Used reversible simulation to optimize the parameters of water models with three different functional forms (Lennard-Jones, double exponential, Buckingham) so that they better match experimental data.
2. **Gas diffusion model**: Trained a model of oxygen diffusing in water so that the simulated diffusion coefficient approaches the experimental value.
3. **Neural network model of diamond**: Trained a machine learning potential for diamond with 121,542 parameters from scratch, demonstrating that the method works for large models.
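
For reference, the three non-bonded functional forms named in the water model example can be written as short functions. This is a sketch using common textbook parameterizations; the parameter names, and the specific double exponential form (normalized so that the minimum value is `-epsilon` at `r_min`), are assumptions and may differ from the exact expressions used in the paper.

```python
import math


def lennard_jones(r, sigma, epsilon):
    """12-6 Lennard-Jones: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def buckingham(r, a, b, c):
    """Buckingham: exponential repulsion plus r^-6 attraction."""
    return a * math.exp(-b * r) - c / r ** 6


def double_exponential(r, r_min, epsilon, alpha, beta):
    """Double exponential with well depth epsilon at r = r_min.

    Requires alpha > beta > 0: alpha sets the steepness of the
    repulsive wall, beta the decay of the attractive tail.
    """
    pref = epsilon / (alpha - beta)
    repulsion = beta * math.exp(alpha * (1.0 - r / r_min))
    attraction = alpha * math.exp(beta * (1.0 - r / r_min))
    return pref * (repulsion - attraction)
```

All three are smooth functions of their parameters, so the same gradient-based training loop can in principle be applied to each form by swapping the potential.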
### Summary:
This paper proposes reversible molecular simulation, a method that addresses the memory, efficiency, and exploding-gradient problems encountered when training molecular force fields and machine learning potentials with differentiable simulation. It is especially suited to training against time-dependent observables, and offers a practical tool for developing next-generation molecular dynamics force fields.