All-in-one foundational models learning across quantum chemical levels

Pavlo O. Dral, Yuxinxin Chen

DOI: https://doi.org/10.26434/chemrxiv-2024-ng3ws

2024-09-18

Abstract: Machine learning (ML) potentials typically target a single quantum chemical (QC) level, while the ML models developed for multi-fidelity learning have not been shown to provide scalable solutions for foundational models. Here we introduce the all-in-one (AIO) ANI model architecture based on multimodal learning, which can learn an arbitrary number of QC levels. Our all-in-one learning approach offers a more general and easier-to-use alternative to transfer learning. We use it to train the AIO-ANI-UIP foundational model, whose generalization capability for organic molecules is comparable to that of semi-empirical GFN2-xTB and of DFT with a double-zeta basis set. We show that the AIO-ANI model can learn across different QC levels, ranging from semi-empirical methods to density functional theory to coupled cluster. We also use AIO models to design the foundational model Δ-AIO-ANI, which is based on Δ-learning and has increased accuracy and robustness compared to AIO-ANI-UIP. The code and the foundational models are available at https://github.com/dralgroup/aio-ani; they will be integrated into the universal and updatable AI-enhanced QM (UAIQM) library and made available in the MLatom package so that they can be used online at the XACS cloud computing platform (see https://github.com/dralgroup/mlatom for updates).
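The idea of one model serving several QC levels can be illustrated with a toy sketch. It is not the AIO-ANI architecture: the abstract only states that the model is based on multimodal learning, so the one-hot level encoding used here is an assumption, a linear least-squares fit stands in for the neural network, and the two "levels" are synthetic data. The names `encode_level` and `predict` are hypothetical.

```python
import numpy as np

def encode_level(features, level_index, n_levels):
    """Append a one-hot level vector to each row of a feature matrix."""
    features = np.asarray(features, dtype=float)
    one_hot = np.zeros((features.shape[0], n_levels))
    one_hot[:, level_index] = 1.0
    return np.hstack([features, one_hot])

# Synthetic data: two hypothetical "levels" that differ by a constant offset.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(50, 3))
true_weights = np.array([0.5, -1.0, 2.0])
energy_low = x @ true_weights + 0.3    # stand-in for a cheaper level
energy_high = x @ true_weights - 0.1   # stand-in for a costlier level

# One training set covering both levels, and one model fitted to all of it.
inputs = np.vstack([encode_level(x, 0, 2), encode_level(x, 1, 2)])
targets = np.concatenate([energy_low, energy_high])
weights, *_ = np.linalg.lstsq(inputs, targets, rcond=None)

def predict(features, level_index):
    """Predict at the requested level by switching the one-hot input."""
    return encode_level(features, level_index, 2) @ weights
```

Calling `predict(x, 0)` or `predict(x, 1)` queries the same fitted weights at either level; adding a further level in this sketch only widens the one-hot vector rather than requiring a second model.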

Chemistry

What problem does this paper attempt to address?

The paper addresses the difficulty of training machine learning (ML) potentials on data computed at different quantum chemical (QC) levels of theory. The authors propose an "all-in-one" (AIO) model architecture that learns from an arbitrary number of QC levels simultaneously. The main objectives are:

1. **Simplifying multi-level learning**: Rather than adapting a model from one level to another, as in transfer learning, the AIO model learns multiple QC levels within a single model.
2. **Improving prediction accuracy**: By combining data from several QC levels, the AIO-ANI-UIP foundational model reaches a generalization capability for organic molecules comparable to semi-empirical GFN2-xTB and to density functional theory (DFT) with a double-zeta basis set, while keeping the computational speed of an ML potential.
3. **Enhancing model generality**: The AIO model can make predictions at different QC levels, from semi-empirical to DFT to coupled cluster, without a separate model for each level.
4. **Improving Δ-learning methods**: AIO models are used to design Δ-AIO-ANI, a foundational model based on Δ-learning with increased accuracy and robustness compared to AIO-ANI-UIP.

In summary, the paper aims to make learning from multi-level data simple and scalable, and thereby to broaden the use of machine learning in quantum chemistry.
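For readers unfamiliar with Δ-learning, the general scheme is to fit a model to the difference between a target level and a cheaper baseline, then add the predicted correction back to the baseline. The sketch below shows only that generic scheme on synthetic one-dimensional data; it does not reproduce how Δ-AIO-ANI is constructed. The functions `baseline_energy`, `target_energy`, and `delta_model` are hypothetical, and a cubic polynomial fit stands in for the ML model.

```python
import numpy as np

def baseline_energy(r):
    """Hypothetical cheap baseline: a harmonic well centred at r = 1."""
    return 0.5 * (r - 1.0) ** 2

def target_energy(r):
    """Hypothetical costly target: the baseline plus a cubic anharmonic term."""
    return baseline_energy(r) - 0.2 * (r - 1.0) ** 3

# Train on the difference between the two levels, not on the target itself.
r_train = np.linspace(0.6, 1.6, 11)
delta_train = target_energy(r_train) - baseline_energy(r_train)
coeffs = np.polyfit(r_train, delta_train, deg=3)

def delta_model(r):
    """Baseline energy plus the learned correction."""
    return baseline_energy(r) + np.polyval(coeffs, r)
```

Because the correction is usually smoother and smaller than the total energy, it tends to be easier to learn; in this toy case the cubic fit recovers the correction essentially exactly.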

Universal and Updatable Artificial Intelligence-Enhanced Quantum Chemical Foundational Models

Learning Together: Towards foundational models for machine learning interatomic potentials with meta-learning

Learning together: Towards foundation models for machine learning interatomic potentials with meta-learning

Constructing accurate and efficient general-purpose atomistic machine learning model with transferable accuracy for quantum chemistry

Quantum machine learning using atom-in-molecule-based fragments selected on-the-fly

Less is more: sampling chemical space with active learning

AIQM2: Better Reaction Simulations with the 2nd Generation of General-Purpose AI-enhanced Quantum Mechanical Method

The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules

Towards Accurate and Efficient Anharmonic Vibrational Frequencies with the Universal Interatomic Potential ANI-1ccx-gelu and Its Fine-Tuning

Accurate and Affordable Simulation of Molecular Infrared Spectra with AIQM Models

Modern Semiempirical Electronic Structure Methods and Machine Learning Potentials for Drug Discovery: Conformers, Tautomers, and Protonation States

Hierarchical Transfer Learning: An Agile and Equitable Strategy for Machine-Learning Interatomic Models

QDπ: A Quantum Deep Potential Interaction Model for Drug Discovery

A Foundation Model for Chemical Design and Property Prediction

A foundation model for atomistic materials chemistry

Physics-informed active learning for accelerating quantum chemical simulations

ANI-1: A data set of 20M off-equilibrium DFT calculations for organic molecules

Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential

Optimized Multifidelity Machine Learning for Quantum Chemistry