DPA-2: a large atomic model as a multi-task learner

Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Anyang Peng, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang

2024-08-16

Abstract: The rapid advancements in artificial intelligence (AI) are catalyzing transformative changes in atomic modeling, simulation, and design. AI-driven potential energy models have demonstrated the capability to conduct large-scale, long-duration simulations with the accuracy of ab initio electronic structure methods. However, the model generation process remains a bottleneck for large-scale applications. We propose a shift towards a model-centric ecosystem, wherein a large atomic model (LAM), pre-trained across multiple disciplines, can be efficiently fine-tuned and distilled for various downstream tasks, thereby establishing a new framework for molecular modeling. In this study, we introduce the DPA-2 architecture as a prototype for LAMs. Pre-trained on a diverse array of chemical and materials systems using a multi-task approach, DPA-2 demonstrates superior generalization capabilities across multiple downstream tasks compared to the traditional single-task pre-training and fine-tuning methodologies. Our approach sets the stage for the development and broad application of LAMs in molecular and materials simulation research.

Chemical Physics, Materials Science, Computational Physics

What problem does this paper attempt to address?

The main objective of this paper is to propose and develop a Large Atomic Model (LAM), introducing an architecture named DPA-2 as a prototype. DPA-2 is trained on a variety of chemical and materials systems through multi-task pre-training, and aims to address the following core issues:

1. **Simplifying the Model Generation Process**: Traditionally, Machine Learning Potentials (MLPs) require a large amount of first-principles data for training, which is time-consuming and resource-intensive to produce. DPA-2 simplifies model generation by adopting a multi-task pre-training method that enables efficient training on diverse datasets.
2. **Improving Model Generalization**: Traditional single-task training strategies limit a model's performance on unseen data. DPA-2 uses multi-task pre-training to integrate datasets from different sources, computed with different Density Functional Theory (DFT) settings, within a unified framework, which improves the model's generalization capability.
3. **Building a Model-Centric Ecosystem**: The paper also aims to establish a model-centric ecosystem in which pre-trained large atomic models can be efficiently fine-tuned and distilled for various downstream tasks, such as molecular modeling. This framework opens the way to large-scale applications in molecular and materials simulation research.
4. **Achieving Efficient Downstream Task Processing**: To make the pre-trained model suitable for specific application scenarios, the paper proposes a workflow of pre-training, fine-tuning, and knowledge distillation. The pre-training phase employs a multi-task training strategy, the fine-tuning phase adapts the model to a specific application scenario, and knowledge distillation then produces simpler, more efficient models for practical use.

In summary, the paper proposes DPA-2, a new Machine Learning Potential model that uses diverse datasets for pre-training and adapts to different downstream tasks through fine-tuning and knowledge distillation, broadening the application scope of machine learning in materials science and molecular simulation.
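
The pre-train, fine-tune, distill workflow summarized above can be illustrated with a deliberately tiny sketch. It is not the DPA-2 architecture or its training code: the "backbone" is a single shared weight, each dataset gets its own scalar offset as a task-specific "head" (loosely mimicking energy shifts between DFT settings), and all names (`dataset_a`, `train_multitask`, `fine_tune_head`, `distill`) as well as the synthetic data are assumptions made purely for illustration.

```python
import random

# Toy stand-in for the three-stage workflow. Everything here is hypothetical
# and far simpler than DPA-2: one shared weight plus one offset per dataset.

def make_dataset(offset, n, rng):
    """Synthetic labels y = 2.0 * x + offset; the offset differs per dataset."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + offset) for x in xs]

def train_multitask(datasets, steps=4000, lr=0.05, seed=0):
    """Multi-task pre-training: shared weight, one head (bias) per dataset."""
    rng = random.Random(seed)
    shared_w = 0.0
    heads = {name: 0.0 for name in datasets}
    names = sorted(datasets)
    for _ in range(steps):
        task = rng.choice(names)               # pick a dataset, then a sample
        x, y = rng.choice(datasets[task])
        err = shared_w * x + heads[task] - y   # residual of the prediction
        shared_w -= lr * err * x               # SGD step on 0.5 * err**2
        heads[task] -= lr * err                # only the sampled head moves
    return shared_w, heads

def fine_tune_head(shared_w, data, steps=500, lr=0.1, seed=1):
    """Fine-tuning: fit a new head on downstream data, backbone kept frozen."""
    rng = random.Random(seed)
    head = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)
        head -= lr * (shared_w * x + head - y)
    return head

def distill(shared_w, head, inputs, steps=4000, lr=0.05, seed=2):
    """Distillation: a fresh student is trained on teacher-generated labels."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x = rng.choice(inputs)
        err = w * x + b - (shared_w * x + head)  # teacher output is the label
        w -= lr * err * x
        b -= lr * err
    return w, b

rng = random.Random(42)
datasets = {
    "dataset_a": make_dataset(-1.0, 50, rng),
    "dataset_b": make_dataset(3.0, 50, rng),
}
shared_w, heads = train_multitask(datasets)
new_head = fine_tune_head(shared_w, make_dataset(0.5, 20, rng))
unlabeled = [rng.uniform(-1.0, 1.0) for _ in range(30)]
student_w, student_b = distill(shared_w, new_head, unlabeled)
print(shared_w, heads, new_head, student_w, student_b)
```

In this toy the student has the same form as the teacher, so distillation only demonstrates the data flow (teacher labels in, student parameters out); in the setting the summary describes, the distilled model would be a simpler, cheaper one.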

DPA-2: Towards a universal large atomic model for molecular and material simulation

DPA-1: Pretraining of Attention-based Deep Potential Model for Molecular Simulation

Pretraining of attention-based deep learning potential model for molecular simulation

Constructing accurate and efficient general-purpose atomistic machine learning model with transferable accuracy for quantum chemistry

ChemDFM: A Large Language Foundation Model for Chemistry

Combining Machine Learning Potential and Structure Prediction for Accelerated Materials Design and Discovery

Overcoming the Size Limit of First Principles Molecular Dynamics Simulations with an In-Distribution Substructure Embedding Active Learner

Accurate and efficient molecular dynamics based on machine learning and non von Neumann architecture

Large-Scale Atomic Simulation via Machine Learning Potentials Constructed by Global Potential Energy Surface Exploration

Deep Learning for Multi-Scale Molecular Modeling

From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction

A Foundation Model for Chemical Design and Property Prediction

A foundation model for atomistic materials chemistry

Multitask Deep Learning with Dynamic Task Balancing for Quantum Mechanical Properties Prediction

Efficient Machine Learning Force Field for Large-Scale Molecular Simulations of Organic Systems

High-speed and low-power molecular dynamics processing unit (MDPU) with ab initio accuracy

Atomistic Modeling of Lithium Materials from Deep Learning Potential with Ab Initio Accuracy

A Perspective on Deep Learning for Molecular Modeling and Simulations

SciDFM: A Large Language Model with Mixture-of-Experts for Science