Abstract: The next generation of force fields for molecular dynamics will be developed using a wealth of data. Training systematically with experimental data remains a challenge, however, especially for machine learning potentials. Differentiable molecular simulation calculates gradients of observables with respect to parameters through molecular dynamics trajectories. Here we improve this approach by explicitly calculating gradients using a reverse-time simulation with effectively constant memory cost. The method is applied to learn all-atom water and gas diffusion models with different functional forms, and to train a machine learning potential for diamond from scratch. Comparison to ensemble reweighting indicates that reversible simulation can provide more accurate gradients and train to match time-dependent observables.

What problem does this paper attempt to address?

The paper asks how experimental data can be used to train classical force fields and machine-learning potentials for molecular dynamics, particularly when the target observables are time-dependent. It proposes reversible molecular simulation to overcome the memory consumption, computational cost, and gradient explosion of existing approaches.

### Main problems:

1. **Memory consumption**: Reverse-mode automatic differentiation (AD) has to store intermediate states, so its memory use grows linearly with the number of simulation steps. This limits its use for long simulations and large neural networks.
2. **Computational efficiency**: Reverse-mode AD runs significantly slower than a standard simulation because of the additional computational overhead.
3. **Gradient explosion**: Gradients propagated through the numerical integrator are prone to explosion, especially in long simulations.
4. **Time-dependent observables**: Ensemble reweighting methods are not applicable to time-dependent observables such as diffusion coefficients and autocorrelation functions.

### Solutions:

The paper proposes a reversible molecular simulation method. Calculating gradients explicitly, rather than through automatic differentiation, gives the following improvements:

- **Memory-efficient**: The memory cost stays effectively constant, since only a small number of coordinate and velocity snapshots need to be stored.
- **Computationally efficient**: The amount of computation is comparable to that of a standard simulation, avoiding the additional overhead of automatic differentiation.
- **Gradient stability**: Gradient truncation avoids gradient explosion and improves the accuracy of the gradients.

### Application examples:

1. **Water models**: The parameters of water models with three different functional forms (Lennard-Jones, double exponential, Buckingham) were optimized through reversible simulation to better match experimental data.
2. **Gas diffusion model**: A model of oxygen molecules in water was trained so that its diffusion coefficient is close to the experimental value.
3. **Neural network model of diamond**: A machine-learning potential for diamond with 121,542 parameters was trained from scratch, demonstrating that the method works for large models.

### Summary:

The paper proposes a reversible molecular simulation method that addresses the memory, efficiency, and gradient explosion problems encountered when training molecular force fields and machine-learning potentials by differentiating through simulations. It is especially suited to training against time-dependent observables, and offers a tool for developing the next generation of molecular dynamics force fields.
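The idea of computing gradients in a reverse-time simulation can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a one-dimensional harmonic oscillator with unit mass, a velocity Verlet integrator, and a loss equal to the final position, and all function names (`accel`, `forward`, `reverse_gradient`) are hypothetical. Because velocity Verlet is time-reversible, the backward pass recomputes each earlier state from the current one instead of storing the trajectory, so memory use does not grow with the number of steps. The result is checked against a finite-difference estimate.

```python
def accel(x, k):
    """Acceleration of a unit-mass harmonic oscillator with spring constant k."""
    return -k * x


def forward(x, v, k, dt, n_steps):
    """Standard velocity Verlet; only the current state is kept."""
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * accel(x, k)
        x = x + dt * v_half
        v = v_half + 0.5 * dt * accel(x, k)
    return x, v


def reverse_gradient(x, v, k, dt, n_steps):
    """Gradient of the final position with respect to k.

    (x, v) is the final state. Each loop iteration undoes one integrator
    step and applies the chain rule for that step by hand, so no
    intermediate states are stored.
    """
    gx, gv, gk = 1.0, 0.0, 0.0  # d(loss)/dx, d(loss)/dv, d(loss)/dk
    for _ in range(n_steps):
        # Chain rule for the second half-kick: v' = v_half + 0.5*dt*(-k*x')
        gx += gv * 0.5 * dt * (-k)
        gk += gv * 0.5 * dt * (-x)
        v_half = v - 0.5 * dt * accel(x, k)  # undo the second half-kick
        # Chain rule for the drift: x' = x + dt*v_half
        g_vhalf = gv + gx * dt
        x = x - dt * v_half  # undo the drift
        # Chain rule for the first half-kick: v_half = v + 0.5*dt*(-k*x)
        gv = g_vhalf
        gx += g_vhalf * 0.5 * dt * (-k)
        gk += g_vhalf * 0.5 * dt * (-x)
        v = v_half - 0.5 * dt * accel(x, k)  # undo the first half-kick
    return gk, x, v


# Toy settings (assumed values, chosen only for illustration).
x0, v0, k, dt, n_steps = 1.0, 0.0, 2.0, 0.01, 1000

x_final, v_final = forward(x0, v0, k, dt, n_steps)
grad, x_rec, v_rec = reverse_gradient(x_final, v_final, k, dt, n_steps)

# Central finite difference as an independent check of the gradient.
eps = 1e-6
fd = (forward(x0, v0, k + eps, dt, n_steps)[0]
      - forward(x0, v0, k - eps, dt, n_steps)[0]) / (2 * eps)

print("reverse-time gradient:", grad)
print("finite difference:    ", fd)
print("recovered initial state:", x_rec, v_rec)
```

In floating-point arithmetic the reversed trajectory drifts slowly away from the forward one, which is consistent with the summary's point that a small number of coordinate and velocity snapshots are still stored. This sketch omits snapshots and gradient truncation.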

Reversible molecular simulation for training classical and machine learning force fields

Differentiable molecular simulation can learn all the parameters in a coarse-grained force field for proteins

Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields

Differentiable simulation to develop molecular dynamics force fields for disordered proteins

Molecular Dynamics with Neural-Network Potentials

Photosensitization and the nervous system in the planarian Dugesia gonocephala

Molecular Force Fields with Gradient-Domain Machine Learning: Construction and Application to Dynamics of Small Molecules with Coupled Cluster Forces

Molecular Dynamics with On-the-Fly Machine Learning of Quantum-Mechanical Forces

Differentiable Molecular Simulations for Control and Learning

Machine Learning of coarse-grained Molecular Dynamics Force Fields

Molecular Force Fields with Gradient-Domain Machine Learning (GDML): Comparison and Synergies with Classical Force Fields

On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events

Force field optimization by end-to-end differentiable atomistic simulation

Learning from the Density to Correct Total Energy and Forces in First Principle Simulations

Efficient Machine Learning Force Field for Large-Scale Molecular Simulations of Organic Systems

Machine Learning Directed Optimization of Classical Molecular Modeling Force Fields

chemtrain: Learning Deep Potential Models via Automatic Differentiation and Statistical Physics

Synthetic Force-Field Database for Training Machine Learning Models to Predict Mobility-Preserving Coarse-Grained Molecular-Simulation Potentials

Reinforcement Learning for Multi-Scale Molecular Modeling

Top-down machine learning of coarse-grained protein force-fields