Abstract: Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for a machine learning revolution and have already been profoundly impacted by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, coarse-grained molecular dynamics, the extraction of free energy surfaces and kinetics, and generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into machine learning structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.

What problem does this paper attempt to address?

The paper addresses the complex and time-consuming calculations in molecular simulations, which are particularly well suited to machine learning (ML) methods. Specifically, it focuses on the following aspects:

1. **Potential Energy Surfaces (PES)**:
   - **Problem**: In molecular dynamics (MD) and Markov chain Monte Carlo (MCMC) simulations, the predictive ability of classical force fields under the Born-Oppenheimer approximation depends on the accuracy of the underlying potential energy surface (PES). However, classical PES models often lack transferability and give accurate results only for geometries close to those used for fitting.
   - **Solution**: Use machine learning methods, especially deep neural networks, to construct models that accurately reproduce the global potential energy surface. The parameters of these models are optimized by energy matching or force matching, which improves prediction accuracy.
2. **Free Energy Surfaces (FES)**:
   - **Problem**: Calculating the free energy of a system in the space of collective variables is an important problem, but the high-dimensional integrals involved cannot be solved analytically in practice.
   - **Solution**: Estimate the free energy and its gradient with machine learning methods such as kernel regression and neural networks, and use these estimates to reconstruct the entire free energy surface. Combined with enhanced sampling methods, the free energy surface can also be learned on the fly during the simulation.
3. **Coarse-graining**:
   - **Problem**: Atomistic simulations are very expensive, especially for complex molecular systems such as proteins.
   - **Solution**: Design coarse-grained models that map the atomistic system onto a smaller number of effective "beads". Use machine learning methods to define the energy function of the coarse-grained model so that it is thermodynamically consistent with the atomistic model.
4. **Kinetics**:
   - **Problem**: Molecular kinetics usually contains slow processes, and simulating these processes directly requires a large amount of computational resources.
   - **Solution**: Learn the molecular kinetics from a given set of trajectory data with machine learning methods, and construct low-dimensional kinetic propagators that simplify analysis and interpretation.
5. **Sampling and Thermodynamics**:
   - **Problem**: Conformational changes related to molecular function are usually rare events, and simulating these events directly requires extremely long simulation times.
   - **Solution**: Use generative learning methods, such as variational autoencoders (VAEs), generative adversarial networks (GANs), and flow-based models, to efficiently generate equilibrium samples that are statistically independent, thereby sidestepping the rare-event sampling problem.
6. **Incorporating Physics into Machine Learning**:
   - **Problem**: How to incorporate known physical principles into machine learning models so that their predictions are physically meaningful.
   - **Solution**: Improve the robustness and prediction accuracy of the model through data augmentation and by building physical symmetries and invariances directly into machine learning models.

Overall, the paper aims to address key problems in molecular simulation with machine learning methods, especially deep neural networks, and to improve the efficiency and accuracy of simulations.
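To make the force matching mentioned in the summary above more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation. It assumes a hypothetical one-dimensional double-well potential U(x) = x^4 - 2x^2 as the reference, and it swaps the neural network for a polynomial energy model so that minimizing the squared force error reduces to linear least squares. All function names are illustrative.

```python
import numpy as np

def reference_force(x):
    """Force F = -dU/dx of the assumed toy potential U(x) = x**4 - 2*x**2."""
    return -(4.0 * x**3 - 4.0 * x)

def force_features(x, degree):
    """Design matrix whose k-th column is -d/dx of x**k (k = 1..degree).

    With the model energy U_theta(x) = sum_k theta_k * x**k, the model
    force is then simply force_features(x, degree) @ theta.
    """
    powers = np.arange(1, degree + 1)
    return -powers * x[:, None] ** (powers - 1)

def fit_force_matching(x, forces, degree=4):
    """Find theta minimizing the squared error between model and reference forces."""
    design = force_features(x, degree)
    theta, *_ = np.linalg.lstsq(design, forces, rcond=None)
    return theta

# Sample training configurations and fit the model to the reference forces.
rng = np.random.default_rng(0)
x_train = rng.uniform(-2.0, 2.0, size=200)
theta = fit_force_matching(x_train, reference_force(x_train))

# Coefficients of x, x**2, x**3, x**4 in the learned energy model.
print(np.round(theta, 6))
```

Because the reference forces here are noise-free and lie within the model class, the fit recovers the coefficients of the toy potential. With a neural network energy model the same loss would be minimized by gradient descent, with the forces obtained by differentiating the network output with respect to the coordinates.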

Machine learning for molecular simulation

Machine Learning in Molecular Dynamics Simulations of Biomolecular Systems

Machine Learning in Molecular Simulations of Biomolecules

Machine Learning for Molecular Dynamics on Long Timescales

Machine learning heralding a new development phase in molecular dynamics simulations

Advances of Machine Learning in Molecular Modeling and Simulation

Molecular Dynamics with On-the-Fly Machine Learning of Quantum-Mechanical Forces

Machine Learning of Molecular Electronic Properties in Chemical Compound Space

Machine Learning in QM/MM Molecular Dynamics Simulations of Condensed-Phase Systems

Machine learning for molecular and materials science

Machine Learning for Molecular Thermodynamics

Reactive molecular dynamics simulations and machine learning

Machine learning model for non-equilibrium structures and energies of simple molecules

Machine Learning for Performance Enhancement of Molecular Dynamics Simulations

Quantum Machine Learning for Chemistry and Physics

Perspective on integrating machine learning into computational chemistry and materials science

Molecular Dynamics with Neural-Network Potentials

Molecular excited states through a machine learning lens

Machine learning accelerates quantum mechanics predictions of molecular crystals

Machine learning enables long time scale molecular photodynamics simulations

Machine Learning Potentials: A Roadmap Toward Next-Generation Biomolecular Simulations