Electrostatic Embedding Machine Learning for Ground and Excited State Molecular Dynamics of Solvated Molecules

Patrizia Mazzeo, Edoardo Cignoni, Amanda Arcidiacono, Lorenzo Cupellini, Benedetta Mennucci
DOI: https://doi.org/10.26434/chemrxiv-2024-65bxp
2024-10-03
Abstract: The application of quantum mechanics (QM) / molecular mechanics (MM) models for studying dynamics in complex systems is nowadays well established. However, their significant limitation is the high computational cost, which restricts their use for larger systems and long-timescale processes. We propose a machine-learning (ML) based approach to study the dynamics of solvated molecules on the ground- and excited-state potential energy surfaces. Our ML model is trained on QM/MM calculations and is designed to predict energies and forces within an electrostatic embedding framework. We built a socket-based interface of our machinery with AMBER to run ML/MM molecular dynamics simulations. As an application, we investigated the excited state intramolecular proton transfer of 3-hydroxyflavone in two different solvents: methanol and methylcyclohexane. Our ML/MM simulations accurately distinguished between the two solvents, effectively reproducing the solvent effects on proton transfer dynamics.
Chemistry
What problem does this paper attempt to address?
The main problem this paper addresses is the high computational cost of quantum mechanics/molecular mechanics (QM/MM) methods when simulating the dynamics of complex systems. Specifically, the authors propose a machine-learning (ML) based method to study the dynamics of solvated molecules on the ground-state and excited-state potential energy surfaces. The ML models are trained on QM/MM calculations to predict energies and forces, which lowers the cost of each simulation step and enables longer time scales while aiming to retain the accuracy of the QM/MM reference. The method also focuses on how to introduce electrostatic embedding into the ML/MM framework so that the influence of the environment on the molecular dynamics is described accurately.

### Main Problems

1. **High Computational Cost**: Although traditional QM/MM methods can accurately simulate the properties and processes of complex systems, their computational cost is very high, which limits their application to larger systems and long-timescale processes.
2. **Accurate Description of Environmental Effects**: When simulating the dynamics of solvated molecules, accurately describing the influence of the solvent on molecular behavior is a key issue.

### Solutions

1. **Machine-Learning Model**: The authors developed an ML model that is trained within the electrostatic embedding framework and predicts the energies and forces of the ground state and the excited state.
2. **Interface with AMBER**: The authors built a socket-based interface that connects the ML model to the AMBER software package for ML/MM molecular dynamics simulations.
3. **Electrostatic Embedding**: Introducing electrostatic embedding into the ML/MM model lets the ML prediction respond to the MM environment, so that the influence of the environment on the molecular dynamics is captured.
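To make the idea of charge (electrostatic) embedding more concrete, the sketch below computes the electrostatic potential that MM point charges generate at each QM atom. This summary does not give the exact form of the paper's embedding descriptor, so treating this potential as the environment-dependent input is an assumption made for illustration. The function name `mm_potential_on_qm_atoms` and the use of atomic units are likewise hypothetical choices.

```python
import numpy as np


def mm_potential_on_qm_atoms(qm_coords, mm_coords, mm_charges):
    """Electrostatic potential of MM point charges at each QM atom.

    Assumed illustration, not the paper's descriptor. Atomic units:
    coordinates in bohr, charges in e, so V_i = sum_k q_k / |R_i - r_k|.
    """
    qm_coords = np.asarray(qm_coords, dtype=float)    # (n_qm, 3)
    mm_coords = np.asarray(mm_coords, dtype=float)    # (n_mm, 3)
    mm_charges = np.asarray(mm_charges, dtype=float)  # (n_mm,)
    # Pairwise QM-MM distances, shape (n_qm, n_mm)
    dist = np.linalg.norm(qm_coords[:, None, :] - mm_coords[None, :, :], axis=-1)
    return (mm_charges[None, :] / dist).sum(axis=1)


# Toy usage: two QM atoms and two MM charges placed on the x axis.
qm = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
mm = [[2.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
charges = [1.0, -1.0]
potential = mm_potential_on_qm_atoms(qm, mm, charges)
print(potential)
```

A vector like `potential` changes whenever the solvent rearranges, which is the kind of information a model needs in order to tell a polar solvent such as methanol from an apolar one such as methylcyclohexane.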

### Application Examples

As an application example, the authors studied the excited-state intramolecular proton transfer of 3-hydroxyflavone in two different solvents, methanol and methylcyclohexane. The ML/MM simulations accurately distinguish between the two solvents and reproduce the influence of the solvent on the proton transfer dynamics.

### Method Overview

1. **ML Model**:
   - **Vacuum Model**: Depends only on the geometry of the QM part and uses the inverse-distance (ID) descriptor.
   - **Environment-Shifted Model**: Accounts for the QM-MM interaction and uses an electrostatic-embedding descriptor.
2. **Gaussian Process Regression (GPR)**: Used to predict energies and forces in a way that ensures energy conservation.
3. **Dataset Generation**: An active learning strategy is used to generate the training dataset, so that it contains the important geometries along the reaction path.
4. **Molecular Dynamics Simulation**: ML/MM molecular dynamics simulations are run in the NVT and NVE ensembles to verify the accuracy and stability of the model.

### Results and Discussion

1. **Model Validation**: The model was validated on geometries of 3-hydroxyflavone in methanol, and the results show that it predicts energies and forces with high precision.
2. **Dynamics Simulation**: The ML/MM simulations reproduce the proton transfer dynamics of 3-hydroxyflavone in the two solvents, which supports the effectiveness of the method.

With this approach, the authors reduce the computational cost relative to QM/MM and provide a way to describe environmental effects accurately in complex systems.
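
The inverse-distance descriptor and the energy-conserving GPR named in the Method Overview can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the authors' implementation: the model is trained on energies only, uses a plain RBF kernel, and obtains forces as the analytic negative gradient of the predicted energy, which is one standard way to make a force field conservative. The class `ToyEnergyGPR`, the function `toy_energy`, and all hyperparameters are hypothetical.

```python
import numpy as np


def inverse_distances(coords):
    """Inverse-distance descriptor: 1/r_ij for every atom pair i < j."""
    i, j = np.triu_indices(len(coords), k=1)
    return 1.0 / np.linalg.norm(coords[i] - coords[j], axis=1)


def descriptor_jacobian(coords):
    """Derivative of each 1/r_ij w.r.t. atomic positions, shape (n_pairs, n_atoms, 3)."""
    n = len(coords)
    i, j = np.triu_indices(n, k=1)
    diff = coords[i] - coords[j]
    r = np.linalg.norm(diff, axis=1)
    grad_i = -diff / r[:, None] ** 3  # d(1/r_ij)/dR_i
    jac = np.zeros((len(i), n, 3))
    pairs = np.arange(len(i))
    jac[pairs, i] = grad_i
    jac[pairs, j] = -grad_i           # d(1/r_ij)/dR_j
    return jac


class ToyEnergyGPR:
    """GPR mean predictor on inverse distances (illustrative assumption)."""

    def __init__(self, length_scale=0.3, noise=1e-4):
        self.length_scale = length_scale
        self.noise = noise

    def _kernel(self, A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-0.5 * d2 / self.length_scale ** 2)

    def fit(self, geometries, energies):
        self.X = np.array([inverse_distances(g) for g in geometries])
        energies = np.asarray(energies, dtype=float)
        self.offset = energies.mean()
        K = self._kernel(self.X, self.X) + self.noise * np.eye(len(self.X))
        self.alpha = np.linalg.solve(K, energies - self.offset)
        return self

    def energy(self, coords):
        x = inverse_distances(coords)
        return self.offset + self._kernel(x[None, :], self.X)[0] @ self.alpha

    def forces(self, coords):
        """Forces as the exact negative gradient of the predicted energy."""
        x = inverse_distances(coords)
        k = self._kernel(x[None, :], self.X)[0]
        # dE/dx = sum_n alpha_n * k_n * (X_n - x) / length_scale^2
        dE_dx = ((self.alpha * k)[:, None] * (self.X - x)).sum(axis=0)
        dE_dx /= self.length_scale ** 2
        # Chain rule through the descriptor: F = -dE/dR
        return -np.einsum("p,pad->ad", dE_dx, descriptor_jacobian(coords))


def toy_energy(coords):
    """Hypothetical reference energy standing in for a QM calculation."""
    return 0.5 * np.sum((1.0 / inverse_distances(coords) - 1.2) ** 2)


rng = np.random.default_rng(0)
base = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
train = [base + 0.05 * rng.standard_normal(base.shape) for _ in range(20)]
model = ToyEnergyGPR().fit(train, [toy_energy(g) for g in train])

geom = base + 0.05 * rng.standard_normal(base.shape)
print("predicted energy:", model.energy(geom), "reference:", toy_energy(geom))
print("predicted forces:\n", model.forces(geom))
```

Because the forces are derived from a single energy function, they are consistent with it by construction, which is the property that matters for stable NVE dynamics. A production model would typically also include force labels in training and add the environment-dependent term.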