Abstract: Machine Learning (ML)-based force fields are attracting ever-increasing interest due to their capacity to span spatiotemporal scales of classical interatomic potentials at quantum-level accuracy. They can be trained based on high-fidelity simulations or experiments, the former being the common case. However, both approaches are impaired by scarce and erroneous data resulting in models that either do not agree with well-known experimental observations or are under-constrained and only reproduce some properties. Here we leverage both Density Functional Theory (DFT) calculations and experimentally measured mechanical properties and lattice parameters to train an ML potential of titanium. We demonstrate that the fused data learning strategy can concurrently satisfy all target objectives, thus resulting in a molecular model of higher accuracy compared to the models trained with a single data source. The inaccuracies of DFT functionals at target experimental properties were corrected, while the investigated off-target properties remained largely unperturbed. Our approach is applicable to any material and can serve as a general strategy to obtain highly accurate ML potentials.

What problem does this paper attempt to address?

This paper addresses how to improve the accuracy of machine learning (ML) force fields by training on experimental and simulation data together. ML force fields are usually trained on either high-fidelity simulations or experiments, but both sources suffer from scarce and erroneous data, so the resulting models may disagree with experimental observations or be under-constrained. The researchers combine density functional theory (DFT) calculations with experimentally measured mechanical properties and lattice parameters of titanium to train an ML potential. Their fused data learning strategy satisfies all target objectives at once and yields a more accurate molecular model than training on a single data source. It corrects the inaccuracies of the DFT functionals at the target experimental properties while leaving the investigated off-target properties largely unperturbed.

The model is a graph neural network (GNN) potential, trained by iteratively applying a DFT trainer and an experimental trainer so that it learns from both simulated and experimental data. The paper also emphasizes the importance of training dataset size, system size, and long-range interactions. Data exclusion experiments show how the amount of experimental data and temperature transferability affect model performance, and indicate that adding more diverse experimental data is more beneficial than densely sampling a single property. Overall, the paper shows how fusing experimental and simulation data can produce more accurate ML force fields and thereby improve the predictive accuracy of molecular dynamics simulations in materials science.
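The alternation between a DFT trainer and an experimental trainer can be illustrated with a minimal sketch. This is not the paper's GNN or its training code: it assumes a hypothetical two-parameter toy potential E(x) = a*x^2 + b*x^4, invented "DFT" energy labels, and an invented "experimental" target for the curvature at x = 0 (a stand-in for an elastic constant). All names, learning rates, and numbers are assumptions chosen for illustration.

```python
# Toy stand-in for an ML potential: E(x) = a*x**2 + b*x**4 with theta = [a, b].
# Every number here is invented for illustration; none comes from the paper.

XS = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
A_DFT, B_DFT = 1.0, 0.5                      # hypothetical "DFT" ground truth
DFT_DATA = [(x, A_DFT * x**2 + B_DFT * x**4) for x in XS]
EXP_TARGET = 2.2                             # hypothetical measured curvature


def energy(theta, x):
    a, b = theta
    return a * x**2 + b * x**4


def dft_mse(theta, data):
    """Mean squared error of model energies against the 'DFT' labels."""
    return sum((energy(theta, x) - e) ** 2 for x, e in data) / len(data)


def dft_step(theta, data, lr):
    """One gradient-descent step on the DFT energy loss."""
    a, b = theta
    n = len(data)
    grad_a = sum(2 * (energy(theta, x) - e) * x**2 for x, e in data) / n
    grad_b = sum(2 * (energy(theta, x) - e) * x**4 for x, e in data) / n
    return [a - lr * grad_a, b - lr * grad_b]


def curvature(theta):
    """Toy observable: second derivative of E at x = 0, which equals 2*a."""
    return 2 * theta[0]


def exp_step(theta, target, lr):
    """One gradient-descent step on the squared error of the observable."""
    a, b = theta
    grad_a = 2 * (curvature(theta) - target) * 2  # chain rule: d(2a)/da = 2
    return [a - lr * grad_a, b]


def train(epochs, use_exp, dft_steps=10, exp_steps=5):
    """Alternate a DFT phase and (optionally) an experimental phase per epoch."""
    theta = [0.0, 0.0]
    for _ in range(epochs):
        for _ in range(dft_steps):
            theta = dft_step(theta, DFT_DATA, lr=0.5)
        if use_exp:
            for _ in range(exp_steps):
                theta = exp_step(theta, EXP_TARGET, lr=0.05)
    return theta


theta_dft = train(300, use_exp=False)    # single data source
theta_fused = train(300, use_exp=True)   # alternating DFT / experiment

for name, th in [("DFT only", theta_dft), ("fused", theta_fused)]:
    print(name, "curvature:", curvature(th), "DFT MSE:", dft_mse(th, DFT_DATA))
```

In this toy setting the DFT labels only weakly constrain the trade-off between `a` and `b`, so the experimental phase can pull the curvature toward its target at a small cost in DFT energy error. Whether the same holds for a real potential depends on the model and data; the sketch only shows the structure of the alternating loop.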

Accurate machine learning force fields via experimental and simulation data fusion

Accurate interatomic force field for molecular dynamics simulation by hybridizing classical and machine learning potentials

BIGDML: Towards Exact Machine Learning Force Fields for Materials

Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields

Liquid to Crystal Si Growth Simulation Using Machine Learning Force Field

BIGDML—Towards accurate quantum machine learning force fields for materials

Construction of accurate machine learning force fields for copper and silicon dioxide

Complexity of many-body interactions in transition metals via machine-learned force fields from the TM23 data set

Efficient Machine Learning Force Field for Large-Scale Molecular Simulations of Organic Systems

On the design space between molecular mechanics and machine learning force fields

Hybrid Classical/Machine-Learning Force Fields for the Accurate Description of Molecular Condensed-Phase Systems

Improving machine learning force fields for molecular dynamics simulations with fine-grained force metrics

Machine Learning of Accurate Energy-Conserving Molecular Force Fields

Machine Learning Force Fields: Construction, Validation, and Outlook

Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations

Molecular Dynamics with On-the-Fly Machine Learning of Quantum-Mechanical Forces

Accurate global machine learning force fields for molecules with hundreds of atoms

Indirect Learning of Interatomic Potentials for Accelerated Materials Simulations

Putting Density Functional Theory to the Test in Machine-Learning-Accelerated Materials Discovery