Ab Initio Molecular Dynamics Simulations of Atmospheric Molecular Clusters Boosted by Neural Networks

Jakub Kubečka, Daniel Ayoubi, Zeyuan Tang, Yosef Knattrup, Morten Engsvang, Haide Wu, Jonas Elm
DOI: https://doi.org/10.26434/chemrxiv-2024-fj8s9
2024-07-01
Abstract: The computational cost of accurate quantum chemistry (QC) calculations of large molecular systems can often be unbearably high. Machine learning offers a lower computational cost compared to QC methods while maintaining their accuracy. In this study, we employ the polarizable atom interaction neural network (PaiNN) architecture to train and model the potential energy surface of molecular clusters relevant to atmospheric new particle formation, such as sulfuric acid–ammonia clusters. We compare the differences between the neural network and previous kernel ridge regression modeling for the Clusteromics I–V data sets. We showcase three models capable of predicting electronic binding energies and interatomic forces with mean absolute errors of <0.3 kcal/mol and <0.2 kcal/mol/Å, respectively. Furthermore, we demonstrate that the error of the modeled properties remains below the chemical accuracy of 1 kcal/mol even for clusters vastly larger than those in the training database (up to (H2SO4)15(NH3)15 clusters, containing 30 molecules). Consequently, we emphasize the potential applications of these models for faster and more thorough configurational sampling and for boosting molecular dynamics studies of large atmospheric molecular clusters.
Chemistry
What problem does this paper attempt to address?
This paper examines how a neural network (the PaiNN architecture) can accelerate quantum chemical calculations of atmospheric molecular clusters, in particular the sulfuric acid–ammonia clusters involved in new particle formation. Quantum chemical methods are computationally expensive for large molecular systems, whereas machine learning models can reach comparable accuracy at a much lower cost. The researchers compared the PaiNN neural network with the earlier kernel ridge regression (KRR) approach on the Clusteromics I–V data sets. They present three models that predict electronic binding energies with a mean absolute error below 0.3 kcal/mol and interatomic forces with a mean absolute error below 0.2 kcal/mol/Å. The models stay within chemical accuracy (1 kcal/mol) even for clusters much larger than those in the training database, up to (H2SO4)15(NH3)15 clusters containing 30 molecules.
The paper also notes that although KRR performs well on small data sets, it is computationally expensive and does not scale to large configurational spaces. Neural network (NN) models, by contrast, handle larger and more complex databases faster, which facilitates molecular dynamics studies and improves the understanding of how atmospheric molecular clusters form and grow. The databases used in the study include sulfuric acid–water clusters, sulfuric acid–ammonia clusters, and small clusters containing various new particle precursors. By comparing different levels of theory and database complexities, the authors evaluated the applicability of the NN models and emphasized that consistent data preparation is important for model accuracy.
In conclusion, the paper addresses the high computational cost of treating atmospheric molecular clusters with traditional quantum chemical methods. It proposes a neural network approach that is both fast and accurate in predicting energies and forces, making it potentially applicable to future large-scale molecular dynamics studies.
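The accuracy figures quoted above are mean absolute errors (MAE) on binding energies and forces. As a rough illustration of how such numbers are typically computed, the sketch below defines a binding energy as the cluster energy minus the sum of the monomer energies and evaluates the MAE against a reference. All numerical values are invented placeholders, not data from the paper. The function names are hypothetical, and averaging the force error over individual Cartesian components is an assumption, since the paper may use a different convention.

```python
import numpy as np

CHEMICAL_ACCURACY = 1.0  # kcal/mol, the threshold quoted in the abstract


def binding_energy(e_cluster, e_monomers):
    """Electronic binding energy: cluster energy minus the sum of monomer energies."""
    return e_cluster - sum(e_monomers)


def mean_absolute_error(predicted, reference):
    """Mean of |predicted - reference| over all array elements."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference)))


# Placeholder binding energies in kcal/mol (illustrative only).
reference_energies = np.array([-25.10, -48.72, -71.35])
predicted_energies = np.array([-25.30, -48.60, -71.65])
energy_mae = mean_absolute_error(predicted_energies, reference_energies)

# Placeholder forces in kcal/mol/Å for a 2-atom system, shape (n_atoms, 3).
# Assumption: the error is averaged over every Cartesian component.
reference_forces = np.array([[0.50, -0.20, 0.10], [-0.50, 0.20, -0.10]])
predicted_forces = np.array([[0.60, -0.10, 0.10], [-0.40, 0.20, -0.30]])
force_mae = mean_absolute_error(predicted_forces, reference_forces)

print(f"energy MAE: {energy_mae:.3f} kcal/mol")
print(f"force MAE:  {force_mae:.3f} kcal/mol/Å")
print("within chemical accuracy:", energy_mae < CHEMICAL_ACCURACY)
```

The same two functions apply unchanged to larger test sets, because the MAE is simply averaged over whatever array shape is passed in.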