Transferable Machine Learning Interatomic Potential for Bond Dissociation Energy Prediction of Drug-like Molecules

Elena Gelžinytė, Mario Öeren, Matthew D. Segall, Gábor Csányi
DOI: https://doi.org/10.26434/chemrxiv-2023-l85nf-v2
2023-12-13
Abstract: We present a transferable MACE interatomic potential that is applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Including an accurate description of radical species extends the scope of possible applications to bond dissociation energy prediction, for example, in the context of cytochrome P450 (CYP) metabolism. The transferability of the MACE potential was validated on the COMP6 dataset, containing only closed-shell molecules, where it reaches better accuracy than the readily available general ANI-2x potential. MACE achieves similar accuracy on two CYP metabolism-specific datasets, which include open- and closed-shell structures. This model enables us to calculate the aliphatic C-H bond dissociation energy (BDE), allowing us to compare reaction energies of hydrogen abstraction, the rate-limiting step of the aliphatic hydroxylation reaction catalysed by CYPs. On the “CYP 3A4” dataset, MACE achieves a BDE RMSE of 1.37 kcal/mol and better prediction of BDE ranks than alternatives - the semi-empirical AM1 and GFN2-xTB methods and the ALFABET model by St. John et al.¹ that predicts bond dissociation enthalpies. Finally, we highlight the smoothness of the MACE potential over paths of sp3 C-H bond elongation and show that a minimal extension is enough for the MACE model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. Altogether, this work lays the ground for further extensions of scope in terms of chemical elements, (CYP-mediated) reaction classes and modelling the full reaction paths, not only bond dissociation energies.
Chemistry
What problem does this paper attempt to address?
The paper addresses how to build a machine learning interatomic potential that accurately predicts bond dissociation energies of drug-like molecules, including the open-shell (radical) species that arise in reactions such as cytochrome P450 (CYP) metabolism. It introduces a transferable MACE potential applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Because the model describes radicals accurately, it can be used to predict bond dissociation energies, which matter for CYP metabolism: hydrogen abstraction is the rate-limiting step of the aliphatic hydroxylation reaction catalysed by CYPs. Transferability is validated on the COMP6 dataset, which contains only closed-shell molecules, where MACE is more accurate than the readily available general ANI-2x potential. MACE reaches similar accuracy on two CYP metabolism-specific datasets that include both open- and closed-shell structures. On the “CYP 3A4” dataset it achieves an aliphatic C-H bond dissociation energy (BDE) RMSE of 1.37 kcal/mol and predicts BDE ranks better than the semi-empirical AM1 and GFN2-xTB methods and the ALFABET model, which predicts bond dissociation enthalpies. The MACE potential is also smooth along paths of sp3 C-H bond elongation, and a minimal extension is enough for the model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. The authors present the work as groundwork for extending the scope to more chemical elements, further (CYP-mediated) reaction classes, and modelling of full reaction paths rather than bond dissociation energies alone.
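As a rough illustration of the quantities discussed above, the sketch below computes a C-H BDE from three total energies, using the usual definition BDE = E(radical) + E(H) - E(molecule), and then scores a set of predicted BDEs against reference values. This is not the authors' code: the function names are hypothetical, the energies are assumed to come from any source (a MACE model, DFT, or a semi-empirical method) in Hartree, and Spearman correlation is only one possible way to measure agreement of BDE ranks, since the abstract does not name the ranking metric.

```python
import math

# Standard conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509


def bond_dissociation_energy(e_molecule, e_radical, e_hydrogen):
    """BDE = E(radical) + E(H atom) - E(parent molecule).

    All inputs are total energies in Hartree; the result is in kcal/mol.
    """
    return (e_radical + e_hydrogen - e_molecule) * HARTREE_TO_KCAL_PER_MOL


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman rank correlation (assumed metric for 'BDE ranks')."""
    ra, rb = ranks(a), ranks(b)
    mean_a = sum(ra) / len(ra)
    mean_b = sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)
```

In use, one would call `bond_dissociation_energy` once per candidate C-H site of a molecule, then pass the predicted and reference BDE lists to `rmse` and `spearman`. A low RMSE indicates accurate absolute BDEs, while a high rank correlation indicates that the weakest C-H bonds, the likeliest sites of hydrogen abstraction, are ordered correctly.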