Abstract: Machine learning has recently entered the mainstream of coarse-grained (CG) molecular modeling and simulation. While a variety of methods for incorporating deep learning into these models exist, many of them involve training neural networks to act directly as the CG force field. This has several benefits, the most significant of which is accuracy. Neural networks can inherently incorporate multi-body effects during the calculation of CG forces, and a well-trained neural network force field outperforms pairwise basis sets generated from essentially any methodology. However, this comes at a significant cost. First, these models are typically slower than pairwise force fields, even when accounting for specialized hardware that accelerates the training and integration of such networks. Second, and the focus of this paper, training such force fields requires a considerable amount of data. It is common to use tens of microseconds of molecular dynamics data to train a single CG model, which approaches the point of eliminating the CG model's usefulness in the first place. As we investigate in this work, this data-hunger trap of neural networks for predicting molecular energies and forces can be remediated in part by incorporating equivariant convolutional operations. We demonstrate that for CG water, networks that incorporate equivariant convolutional operations can produce functional models using datasets as small as a single frame of reference data, while networks without these operations cannot.

What problem does this paper attempt to address?

The paper aims to address the "data hunger" issue encountered when applying neural networks in coarse-grained (CG) molecular simulations. Specifically, the paper focuses on the following points:

1. **Data Efficiency Issue**: Traditional neural network-based CG force fields require a large amount of training data (usually tens of microseconds of molecular dynamics data), which significantly reduces the advantages of CG models in practical applications.
2. **Model Stability and Accuracy**: By introducing equivariant convolutional operations, the paper attempts to improve the data efficiency of CG models and ensure stable model performance even with less training data.
3. **Comparison of Different Architectures**: The paper compares two different neural network architectures: DeePMD (symmetry-invariant) and Allegro (equivariant). The results show that equivariant models (such as Allegro) can generate stable and accurate CG water models even with very limited training data.

In summary, the main goal of the paper is to improve the data efficiency of neural network-based CG models by introducing equivariance, thereby maintaining good performance and stability with less data.
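The symmetry property behind the term "equivariance" can be illustrated with a small numerical check. The sketch below is not the paper's method and uses neither DeePMD nor Allegro: it assumes a toy Lennard-Jones pair potential as a stand-in for a force field, and all function names and parameter values are hypothetical. It shows only what the property means at the output level: rotating the input configuration leaves the energy unchanged (invariance) and rotates the forces by the same rotation (equivariance).

```python
import math

def rotate_z(vectors, angle):
    """Rotate a list of 3D vectors about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in vectors]

def pair_energy_and_forces(positions, epsilon=1.0, sigma=1.0):
    """Toy Lennard-Jones energy and forces (a stand-in, not a learned model)."""
    n = len(positions)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            sr6 = (sigma * sigma / r2) ** 3
            energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
            # Force on i is -dU/dr_i = coeff * (r_i - r_j)
            coeff = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                forces[i][k] += coeff * d[k]
                forces[j][k] -= coeff * d[k]
    return energy, [tuple(f) for f in forces]

def max_abs_diff(a, b):
    """Largest component-wise difference between two lists of 3D vectors."""
    return max(abs(x - y) for p, q in zip(a, b) for x, y in zip(p, q))

# Hypothetical three-site configuration and rotation angle.
positions = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.3, 1.2, 0.4)]
angle = 0.7

e0, f0 = pair_energy_and_forces(positions)
e1, f1 = pair_energy_and_forces(rotate_z(positions, angle))

energy_gap = abs(e1 - e0)                          # invariance of the energy
force_gap = max_abs_diff(f1, rotate_z(f0, angle))  # equivariance of the forces
unrotated_gap = max_abs_diff(f1, f0)               # forces do change with rotation

print(f"energy gap: {energy_gap:.2e}")
print(f"force gap (vs. rotated forces): {force_gap:.2e}")
print(f"force gap (vs. unrotated forces): {unrotated_gap:.2e}")
```

The same check could be applied to any force model by swapping out `pair_energy_and_forces`; a model that fails it would have to learn rotational symmetry from data rather than having it built in.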

Coarse-Graining with Equivariant Neural Networks: A Path Towards Accurate and Data-Efficient Models
