Hybrid Classical/Machine-Learning Force Fields for the Accurate Description of Molecular Condensed-Phase Systems

Moritz Thürlemann, Sereina Riniker
DOI: https://doi.org/10.1039/D3SC04317G
2023-08-17
Abstract: Electronic structure methods offer in principle accurate predictions of molecular properties, however, their applicability is limited by computational costs. Empirical methods are cheaper, but come with inherent approximations and are dependent on the quality and quantity of training data. The rise of machine learning (ML) force fields (FFs) exacerbates limitations related to training data even further, especially for condensed-phase systems for which the generation of large and high-quality training datasets is difficult. Here, we propose a hybrid ML/classical FF model that is parametrized exclusively on high-quality ab initio data of dimers and monomers in vacuum but is transferable to condensed-phase systems. The proposed hybrid model combines our previous ML-parametrized classical model with ML corrections for situations where classical approximations break down, thus combining the robustness and efficiency of classical FFs with the flexibility of ML. Extensive validation on benchmarking datasets and experimental condensed-phase data, including organic liquids and small-molecule crystal structures, showcases how the proposed approach may promote FF development and unlock the full potential of classical FFs.
Chemical Physics
What problem does this paper attempt to address?
The paper addresses the problem of describing atomic interactions in condensed-phase systems accurately. Electronic structure methods can in principle predict molecular properties accurately, but their computational cost limits where they can be applied. Empirical methods are cheaper, but they rest on inherent approximations and depend on the quality and quantity of training data. Machine learning (ML) force fields make the training-data problem more acute, especially for condensed-phase systems, where large, high-quality datasets are difficult to generate. The paper proposes a hybrid ML/classical force field that is parametrized only on high-quality ab initio data of dimers and monomers in vacuum yet transfers to condensed-phase systems. The hybrid model combines the robustness and efficiency of classical force fields with the flexibility of ML. Validation on benchmark datasets and on experimental condensed-phase data, including organic liquids and small-molecule crystal structures, is used to show that the approach can promote force field development and unlock the full potential of classical force fields.
The main focus is transferability from small isolated systems to large condensed-phase systems, a key attribute that is often overlooked when assessing the potential of ML. The model uses existing classical models to describe atomic interactions, uses ML to parametrize those models, and replaces or corrects the classical description where it fails, for example at short range and at large overlap. The methods include message-passing graph neural networks operating on molecular graphs and geometric graphs, which extract features and predict atomic multipole moments, polarization, and short-range interactions.
The model takes features of the molecular graph and the geometric graph as input and decomposes the interaction into separate, interpretable terms such as electrostatics, polarization, and dispersion, adding ML corrections where the classical approximations fail. In this way, the model transfers from the gas phase to the condensed phase while retaining a degree of interpretability.
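The idea of a classical energy plus an ML correction that acts only where the classical description breaks down can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes point charges instead of atomic multipoles, an undamped C6 dispersion term, no polarization, and a hypothetical exponential function `toy_correction` standing in for a trained network. The function names, the switching distances `r_on` and `r_off`, and all numerical parameters are illustrative assumptions.

```python
import math
from typing import Callable

# Coulomb constant in kJ mol^-1 Angstrom e^-2.
COULOMB_CONST = 1389.35458


def smooth_switch(r: float, r_on: float, r_off: float) -> float:
    """Return 1 below r_on, 0 above r_off, and a cubic smoothstep in between."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    x = (r - r_on) / (r_off - r_on)
    return 1.0 - x * x * (3.0 - 2.0 * x)


def classical_pair_energy(r: float, q_i: float, q_j: float, c6_ij: float) -> float:
    """Point-charge electrostatics plus undamped C6 dispersion (simplified).

    In a hybrid model the charges and C6 coefficient would be predicted by ML;
    here they are passed in as plain numbers.
    """
    coulomb = COULOMB_CONST * q_i * q_j / r
    dispersion = -c6_ij / r**6
    return coulomb + dispersion


def hybrid_pair_energy(
    r: float,
    q_i: float,
    q_j: float,
    c6_ij: float,
    short_range_correction: Callable[[float], float],
    r_on: float = 3.0,
    r_off: float = 5.0,
) -> float:
    """Classical pair energy plus a correction that is switched off at long range."""
    correction = smooth_switch(r, r_on, r_off) * short_range_correction(r)
    return classical_pair_energy(r, q_i, q_j, c6_ij) + correction


def toy_correction(r: float) -> float:
    """Hypothetical stand-in for a learned short-range term (exponential repulsion)."""
    return 1.0e5 * math.exp(-3.5 * r)


if __name__ == "__main__":
    for r in (2.0, 4.0, 6.0):
        e_classical = classical_pair_energy(r, 0.4, -0.4, 1000.0)
        e_hybrid = hybrid_pair_energy(r, 0.4, -0.4, 1000.0, toy_correction)
        print(f"r = {r:.1f} A  classical = {e_classical:.3f}  hybrid = {e_hybrid:.3f}")
```

Beyond `r_off` the hybrid energy reduces exactly to the classical terms, which mirrors the design choice described above: the long-range physics is left to the classical model, and the learned part is confined to the region where atoms overlap.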