AIMNet2: A Neural Network Potential to Meet your Neutral, Charged, Organic, and Elemental-Organic Needs

Olexandr Isayev, Dylan Anstine, Roman Zubatyuk
DOI: https://doi.org/10.26434/chemrxiv-2023-296ch-v2
2024-04-30
Abstract: Machine learned interatomic potentials (MLIPs) are reshaping computational chemistry practices because of their ability to drastically exceed the accuracy-length/time scale tradeoff. Despite this attraction, the benefits of such efficiency are only impactful when an MLIP uniquely enables insight into a target system or is broadly transferable outside of the training dataset, where models achieving the latter are seldom reported. In this work, we present the 2nd generation of our atoms-in-molecules neural network potential (AIMNet2), which is applicable to species composed of up to 14 chemical elements in both neutral and charged states, making it a valuable model for modeling the majority of non-metallic compounds. Using an exhaustive dataset of 20 million hybrid quantum chemical calculations, AIMNet2 combines ML-parameterized short-range and physics-based long-range terms to attain generalizability that reaches from simple organics to diverse molecules with “exotic” element-organic bonding. We show that AIMNet2 outperforms semi-empirical GFN-xTB and is on par with reference density functional theory for interaction energy contributions, conformer search tasks, torsion rotation profiles, and molecular-to-macromolecular geometry optimization. Overall, the demonstrated chemical coverage and computational efficiency of AIMNet2 is a significant step toward providing access to MLIPs that avoid the crucial limitation of curating additional quantum chemical data and retraining with each new application.
Chemistry
What problem does this paper attempt to address?
The paper aims to develop a general and efficient machine-learned interatomic potential (MLIP) that applies across compounds with different chemical compositions and charge states, including molecules with complex or atypical bonding patterns. Specifically, it presents AIMNet2, the second generation of the authors' atoms-in-molecules neural network potential, which is designed to overcome the limited chemical-space coverage and generalization ability of existing MLIPs.

### Main problems and goals:

1. **Improve chemical space coverage**:
   - Existing MLIPs usually handle only specific systems or a narrow set of compounds, which limits their range of application. The paper targets a model that covers up to 14 chemical elements in both neutral and charged states, making it suitable for most non-metallic compounds.
2. **Enhance generalization ability**:
   - The paper emphasizes transferability, that is, maintaining high accuracy on chemical systems outside the training dataset. By combining ML-parameterized short-range terms with physics-based long-range terms, AIMNet2 extends from simple organic molecules to diverse molecules with "exotic" element-organic bonding.
3. **Improve computational efficiency**:
   - A main advantage of MLIPs is that they are far cheaper to evaluate than conventional quantum-chemical calculations. AIMNet2 is trained on a dataset of 20 million hybrid quantum-chemical calculations and is reported to outperform semi-empirical GFN-xTB and to be on par with reference density functional theory on the tasks tested.

### Key technical means:

- **Data distillation**: Build the training set by progressively selecting the most representative data points, reducing redundancy and maximizing the contribution of each data point to model optimization.
- **Message-passing architecture**: Describe each atomic environment and capture fine chemical detail by iteratively updating atomic embeddings and partial charges.
- **Long-range interactions**: Combine explicit dispersion corrections with electrostatic interactions so that the model describes long-range effects accurately.

### Summary:

AIMNet2 efficiently models a wide range of chemical systems and remains accurate for complex or atypical bonding patterns. This makes it a useful tool for chemical research, especially when large numbers of molecular structures must be screened quickly or large-scale materials modeling is required.
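
To make the long-range electrostatic term listed under the key technical means more concrete, the sketch below sums a plain point-charge Coulomb energy over atom pairs and adds it to short-range and dispersion contributions. It is only an illustration under stated assumptions: the function names `coulomb_energy` and `total_energy` are hypothetical, the undamped pairwise form is a textbook expression rather than AIMNet2's actual functional form, and in the real model the partial charges would come from the network rather than being supplied by hand.

```python
import math

# e^2 / (4*pi*eps0) in eV * Angstrom, so charges are in units of e
# and distances are in Angstrom.
COULOMB_EV_ANGSTROM = 14.399645


def coulomb_energy(charges, positions):
    """Undamped point-charge Coulomb energy (eV) over all unique atom pairs.

    Assumption: a textbook pairwise sum, not AIMNet2's actual expression.
    """
    energy = 0.0
    n_atoms = len(charges)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            r_ij = math.dist(positions[i], positions[j])
            energy += COULOMB_EV_ANGSTROM * charges[i] * charges[j] / r_ij
    return energy


def total_energy(e_short_range, e_dispersion, charges, positions):
    """Hypothetical decomposition: short-range + dispersion + electrostatics.

    e_short_range stands in for the ML-parameterized term and e_dispersion
    for an explicit dispersion correction; both are passed in as numbers.
    """
    return e_short_range + e_dispersion + coulomb_energy(charges, positions)


if __name__ == "__main__":
    # Two opposite unit charges separated by 1 Angstrom.
    charges = [1.0, -1.0]
    positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    print(coulomb_energy(charges, positions))
    print(total_energy(-10.0, -0.5, charges, positions))
```

For two opposite unit charges 1 Å apart, the electrostatic term equals the negative of the Coulomb constant used here. Keeping the three contributions as separate arguments mirrors the split described above between a learned short-range part and physics-based long-range parts.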