All-in-one foundational models learning across quantum chemical levels

Pavlo O. Dral, Yuxinxin Chen
DOI: https://doi.org/10.26434/chemrxiv-2024-ng3ws
2024-09-18
Abstract: Machine learning (ML) potentials typically target a single quantum chemical (QC) level while the ML models developed for multi-fidelity learning have not been shown to provide scalable solutions for foundational models. Here we introduce the all-in-one (AIO) ANI model architecture based on multimodal learning which can learn an arbitrary number of QC levels. Our all-in-one learning approach offers a more general and easier-to-use alternative to transfer learning. We use it to train the AIO-ANI-UIP foundational model with the generalization capability comparable to semi-empirical GFN2-xTB and DFT with a double-zeta basis set for organic molecules. We show that the AIO-ANI model can learn across different QC levels ranging from semi-empirical to density functional theory to coupled cluster. We also use AIO models to design the foundational model Δ-AIO-ANI based on Δ-learning with increased accuracy and robustness compared to AIO-ANI-UIP. The code and the foundational models are available at https://github.com/dralgroup/aio-ani; they will be integrated into the universal and updatable AI-enhanced QM (UAIQM) library and made available in the MLatom package so that they can be used online at the XACS cloud computing platform (see https://github.com/dralgroup/mlatom for updates).
Chemistry
What problem does this paper attempt to address?
The paper addresses the difficulty of training machine learning (ML) potentials on data computed at different quantum chemical (QC) levels of theory. ML potentials typically target a single QC level, and existing multi-fidelity ML models have not been shown to scale to foundational models. The authors propose the "all-in-one" (AIO) ANI architecture, based on multimodal learning, which can learn an arbitrary number of QC levels within a single model. The main objectives are:

1. **Simplifying multi-level learning**: The AIO approach is presented as a more general and easier-to-use alternative to transfer learning, handling multiple QC levels within one model.
2. **Achieving broad generalization**: The AIO-ANI-UIP foundational model trained with this approach has a generalization capability comparable to semi-empirical GFN2-xTB and to density functional theory (DFT) with a double-zeta basis set for organic molecules.
3. **Covering different QC levels with one model**: The AIO-ANI model can learn across QC levels ranging from semi-empirical methods to DFT to coupled cluster, without a separate model for each level.
4. **Building on Δ-learning**: AIO models are used to design the Δ-AIO-ANI foundational model, based on Δ-learning, which is more accurate and robust than AIO-ANI-UIP.

In summary, the paper aims to make learning from multi-level QC data simple and scalable, so that a single foundational model can serve several QC levels.
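The idea of one model serving several QC levels can be illustrated with a toy sketch. The abstract does not say how the QC level is supplied to the network, so the one-hot level encoding appended to a structural descriptor below is an assumption, as are all names (`QC_LEVELS`, `ToyAIOModel`, `delta_prediction`) and the three example level labels. The weights are random and untrained, so the numbers carry no chemical meaning. The `delta_prediction` function shows one possible reading of a Δ-learning correction built from such a model, not necessarily the construction used for Δ-AIO-ANI.

```python
import math
import random

# Hypothetical labels for the QC levels the toy model distinguishes.
QC_LEVELS = ["GFN2-xTB", "DFT", "CCSD(T)"]


def one_hot(level):
    """Encode a QC level label as a one-hot list (assumed encoding)."""
    if level not in QC_LEVELS:
        raise ValueError(f"unknown QC level: {level}")
    return [1.0 if name == level else 0.0 for name in QC_LEVELS]


class ToyAIOModel:
    """Single-hidden-layer network fed with descriptor + level encoding.

    Weights are random and untrained; this only shows the data flow.
    """

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        n_in = n_features + len(QC_LEVELS)
        self.w1 = [[rng.gauss(0.0, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.gauss(0.0, 0.5) for _ in range(n_hidden)]

    def predict(self, descriptor, level):
        """Return a scalar 'energy' for the descriptor at the given level."""
        x = list(descriptor) + one_hot(level)
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)))
                  for row in self.w1]
        return sum(w * h for w, h in zip(self.w2, hidden))


def delta_prediction(model, descriptor, baseline_energy,
                     baseline_level, target_level):
    """Baseline energy plus the model's predicted level-to-level difference."""
    correction = (model.predict(descriptor, target_level)
                  - model.predict(descriptor, baseline_level))
    return baseline_energy + correction


if __name__ == "__main__":
    model = ToyAIOModel(n_features=4)
    descriptor = [0.1, 0.4, -0.2, 0.7]  # placeholder structural features
    for level in QC_LEVELS:
        print(level, model.predict(descriptor, level))
    # Correct a made-up baseline energy toward the target level.
    print(delta_prediction(model, descriptor, -1.5, "GFN2-xTB", "CCSD(T)"))
```

Because the level is an input rather than a separate output head or a separate model, adding another QC level in this sketch only lengthens the encoding. The same set of weights is shared across all levels.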