Machine-learned molecular mechanics force fields from large-scale quantum chemical data

Kenichiro Takaba, Anika Friedman, Chapin Cavender, Pavan Behara, Iván Pulido, Mike Henry, Hugo MacDermott-Opeskin, Christopher Iacovella, Arnav Nagle, Alexander Payne, Michael Shirts, David L. Mobley, John D. Chodera, Yuanqing Wang
DOI: https://doi.org/10.1039/d4sc00690a
IF: 8.4
2024-06-27
Chemical Science
Abstract: The development of reliable and extensible molecular mechanics (MM) force fields (fast, empirical models characterizing the potential energy surface of molecular systems) is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides and folded proteins, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
chemistry, multidisciplinary
What problem does this paper attempt to address?
The paper addresses the limitations of traditional molecular mechanics (MM) force field development. Traditional force fields rely on expert knowledge and rule-based parameter assignment, which limits their accuracy in complex chemical environments and makes them difficult to extend to new chemical domains. The paper introduces espaloma-0.3, a machine-learned, extensible MM force field built with an end-to-end differentiable framework that uses graph neural networks in place of rule-based methods. Trained in a single GPU-day on a dataset of over 1.1 million quantum chemical energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of small molecules, peptides, and nucleic acids. It also maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides and folded proteins. Because it parametrizes proteins and ligands self-consistently, it produces stable simulations and accurate predictions of protein-ligand binding free energies. The approach offers a path toward systematically building more accurate force fields that are easily extensible to new chemical domains, potentially simplifying and automating force field development.
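To make "fitting to energy and force calculations" concrete, the following is a minimal sketch of what a combined energy-and-force objective could look like for a single harmonic bond term. It is not the espaloma-0.3 implementation: in the paper the MM parameters are produced by graph neural networks, whereas here they are two plain numbers. The functional form, the function names, the force weight, and the reference data are all assumptions made for illustration.

```python
# Illustrative only: one harmonic bond term, E(r) = 0.5 * k * (r - r0)**2,
# scored against reference energies and forces. All names and numbers here
# are hypothetical and are not taken from the paper.

def bond_energy(r, k, r0):
    """Harmonic bond energy at bond length r."""
    return 0.5 * k * (r - r0) ** 2


def bond_force(r, k, r0):
    """Force along the bond coordinate, F = -dE/dr."""
    return -k * (r - r0)


def energy_force_loss(params, lengths, ref_energies, ref_forces, force_weight=1.0):
    """Mean squared energy error plus weighted mean squared force error."""
    k, r0 = params
    n = len(lengths)
    e_err = sum((bond_energy(r, k, r0) - e) ** 2
                for r, e in zip(lengths, ref_energies)) / n
    f_err = sum((bond_force(r, k, r0) - f) ** 2
                for r, f in zip(lengths, ref_forces)) / n
    return e_err + force_weight * f_err


# Synthetic reference data generated from known parameters, standing in for
# quantum chemical energies and forces (units are arbitrary in this sketch).
TRUE_PARAMS = (500.0, 1.09)
LENGTHS = [1.00, 1.05, 1.09, 1.15, 1.20]
REF_E = [bond_energy(r, *TRUE_PARAMS) for r in LENGTHS]
REF_F = [bond_force(r, *TRUE_PARAMS) for r in LENGTHS]
```

With this setup, the loss vanishes at the generating parameters and is positive elsewhere, so minimizing it over `(k, r0)` recovers them. Including forces adds one constraint per coordinate for each reference geometry, which is one reason force data is useful alongside energies when fitting.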