Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics

Marloes Arts, Victor Garcia Satorras, Chin-Wei Huang, Daniel Zuegner, Marco Federici, Cecilia Clementi, Frank Noé, Robert Pinsler, Rianne van den Berg
2023-09-22
Abstract: Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several small- to medium-sized protein simulations, reproducing the CG equilibrium distribution, and preserving dynamics of all-atom simulations such as protein folding events.
Machine Learning
What problem does this paper attempt to address?
The paper addresses a key challenge in coarse-grained (CG) molecular dynamics: how to learn a CG force field accurate enough that CG simulations reproduce the important features of atomistic simulations, while reaching temporal and spatial scales that would be intractable at atomistic resolution. The authors exploit the connection between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force information during training. The approach has three parts:

1. **Model Training**: A diffusion generative model is trained on protein structures obtained from molecular dynamics simulations.
2. **Force Field Extraction**: The authors show that the score function of the trained generative model approximates a force field, which can be used directly to run CG molecular dynamics simulations.
3. **Performance Validation**: Compared to previous work, the method greatly simplifies the training setup and shows improved performance across multiple protein simulations, including systems with up to 56 amino acids. It reproduces the CG equilibrium distribution and preserves dynamical features of all-atom simulations, such as protein folding events.

The result is a simple single-stage training framework that performs well across several protein simulations and shows potential advantages for simulating larger proteins. The method also overcomes some limitations of traditional methods, such as low data efficiency, high computational cost, and insufficient model accuracy.
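The link between a score function and a force field can be illustrated with a small sketch. If the equilibrium density is p(x) ∝ exp(-U(x)/kT), then the score ∇ log p(x) equals -∇U(x)/kT, so a force can be read off as F(x) = kT · ∇ log p(x). The code below is only an illustration under stated assumptions, not the authors' implementation: `toy_score` is a hypothetical analytic stand-in for a trained diffusion model evaluated at a low noise level, and the overdamped Langevin integrator, step size, and friction value are choices made for this example.

```python
import numpy as np

def toy_score(x, sigma=1.0):
    """Hypothetical stand-in for a learned score network.

    Returns grad_x log p(x) for an isotropic Gaussian N(0, sigma^2 I),
    which is -x / sigma^2.
    """
    return -x / sigma**2

def force_from_score(score_fn, x, kT=1.0):
    """Convert a score into a force: F = -grad U = kT * grad log p."""
    return kT * score_fn(x)

def langevin_simulate(score_fn, x0, n_steps, dt=1e-2, kT=1.0, friction=1.0, seed=0):
    """Overdamped Langevin dynamics (Euler-Maruyama) driven by the score.

    x0 may hold many independent walkers, e.g. shape (n_walkers, n_dims).
    Returns the trajectory with shape (n_steps, *x0.shape).
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    traj = np.empty((n_steps,) + x.shape)
    noise_scale = np.sqrt(2.0 * kT * dt / friction)
    for t in range(n_steps):
        force = force_from_score(score_fn, x, kT)
        # Deterministic drift from the force plus thermal noise.
        x = x + (dt / friction) * force + noise_scale * rng.standard_normal(x.shape)
        traj[t] = x  # store the position after the update
    return traj

# Usage: 500 independent walkers in 3 dimensions, all started at the origin.
traj = langevin_simulate(toy_score, np.zeros((500, 3)), n_steps=500)
equilibrated = traj[100:]  # discard the initial relaxation
print("sample variance:", equilibrated.var())
```

For this toy Gaussian the sampled variance should end up close to sigma² = 1, which is the sense in which dynamics driven by the score reproduce the equilibrium distribution the model was trained on. In a real setting the analytic `toy_score` would be replaced by the trained network's output on CG coordinates.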