DPA-2: a large atomic model as a multi-task learner

Duo Zhang, Xinzijian Liu, Xiangyu Zhang, Chengqian Zhang, Chun Cai, Hangrui Bi, Yiming Du, Xuejian Qin, Anyang Peng, Jiameng Huang, Bowen Li, Yifan Shan, Jinzhe Zeng, Yuzhi Zhang, Siyuan Liu, Yifan Li, Junhan Chang, Xinyan Wang, Shuo Zhou, Jianchuan Liu, Xiaoshan Luo, Zhenyu Wang, Wanrun Jiang, Jing Wu, Yudi Yang, Jiyuan Yang, Manyi Yang, Fu-Qiang Gong, Linshuang Zhang, Mengchao Shi, Fu-Zhi Dai, Darrin M. York, Shi Liu, Tong Zhu, Zhicheng Zhong, Jian Lv, Jun Cheng, Weile Jia, Mohan Chen, Guolin Ke, Weinan E, Linfeng Zhang, Han Wang
2024-08-16
Abstract: The rapid advancements in artificial intelligence (AI) are catalyzing transformative changes in atomic modeling, simulation, and design. AI-driven potential energy models have demonstrated the capability to conduct large-scale, long-duration simulations with the accuracy of ab initio electronic structure methods. However, the model generation process remains a bottleneck for large-scale applications. We propose a shift towards a model-centric ecosystem, wherein a large atomic model (LAM), pre-trained across multiple disciplines, can be efficiently fine-tuned and distilled for various downstream tasks, thereby establishing a new framework for molecular modeling. In this study, we introduce the DPA-2 architecture as a prototype for LAMs. Pre-trained on a diverse array of chemical and materials systems using a multi-task approach, DPA-2 demonstrates superior generalization capabilities across multiple downstream tasks compared to the traditional single-task pre-training and fine-tuning methodologies. Our approach sets the stage for the development and broad application of LAMs in molecular and materials simulation research.
Chemical Physics, Materials Science, Computational Physics
What problem does this paper attempt to address?
The main objective of this paper is to propose and develop a Large Atomic Model (LAM), introducing an architecture named DPA-2 as a prototype. DPA-2 is trained on a variety of chemical and materials systems through multi-task pre-training, and aims to address the following core issues:
1. **Simplifying the Model Generation Process**: Traditionally, Machine Learning Potentials (MLPs) require a large amount of first-principles data for training, which is a time-consuming and resource-intensive process. DPA-2 simplifies model generation by adopting a multi-task pre-training method that enables efficient training on diverse datasets.
2. **Improving Model Generalization**: Traditional single-task training strategies limit the model's performance on unseen data. DPA-2 uses multi-task pre-training to integrate datasets from different sources, computed with different Density Functional Theory (DFT) settings, within a unified framework, which enhances the model's generalization capability.
3. **Building a Model-Centric Ecosystem**: The paper also aims to establish a model-centric ecosystem in which pre-trained large atomic models can be efficiently fine-tuned and distilled for various downstream tasks, such as molecular modeling. This framework opens the way to large-scale applications in molecular and materials simulation research.
4. **Achieving Efficient Downstream Task Processing**: To make the pre-trained model suitable for specific application scenarios, the paper proposes a workflow of pre-training, fine-tuning, and knowledge distillation. The pre-training phase employs a multi-task training strategy, the fine-tuning phase adapts the model to a specific application scenario, and knowledge distillation then produces a simpler, more efficient model for practical use.
In summary, the core of this paper is the proposal of a new Machine Learning Potential model, DPA-2, which uses diverse datasets for pre-training and adapts to different downstream tasks through fine-tuning and knowledge distillation, broadening the application scope of machine learning in materials science and molecular simulation.
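The pre-training and fine-tuning stages described above can be illustrated with a toy example. This is a minimal sketch under strong assumptions, not the DPA-2 implementation: a linear model stands in for the network, the shared weight vector plays the role of the shared representation, and a per-dataset scalar offset plays the role of a task-specific head (mimicking datasets whose labels follow different conventions). Sampling one task per training step is also an assumption about the schedule. All function and variable names are hypothetical.

```python
import random


def make_task(w_true, offset, n, rng):
    """Synthetic dataset with labels y = w_true . x + offset.

    The offset is specific to one dataset (hypothetical stand-in for
    dataset-dependent label conventions).
    """
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in w_true]
        y = sum(w * xi for w, xi in zip(w_true, x)) + offset
        data.append((x, y))
    return data


def predict(shared_w, head, x):
    """Shared linear part plus a task-specific offset."""
    return sum(w * xi for w, xi in zip(shared_w, x)) + head


def pretrain_multitask(tasks, dim, steps=5000, lr=0.05, seed=0):
    """Multi-task SGD: shared weights see every task, each head sees only its own."""
    rng = random.Random(seed)
    shared_w = [0.0] * dim                 # parameters shared by all tasks
    heads = {name: 0.0 for name in tasks}  # one task-specific head per dataset
    names = sorted(tasks)
    for _ in range(steps):
        name = rng.choice(names)           # sample one task for this step
        x, y = rng.choice(tasks[name])     # sample one example from that task
        err = predict(shared_w, heads[name], x) - y
        # Gradient step on 0.5 * err**2
        for i in range(dim):
            shared_w[i] -= lr * err * x[i]
        heads[name] -= lr * err            # only the sampled task's head moves
    return shared_w, heads


def fine_tune_head(shared_w, data, steps=500, lr=0.1, seed=1):
    """Fine-tuning on a new task: freeze the shared weights, fit a fresh head."""
    rng = random.Random(seed)
    head = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)
        head -= lr * (predict(shared_w, head, x) - y)
    return head


rng = random.Random(42)
w_true = [1.5, -2.0]
tasks = {
    "dataset_a": make_task(w_true, 3.0, 50, rng),
    "dataset_b": make_task(w_true, -1.0, 50, rng),
}
shared_w, heads = pretrain_multitask(tasks, dim=2)

# A small downstream dataset (5 samples) with its own offset.
new_head = fine_tune_head(shared_w, make_task(w_true, 0.5, 5, rng))
print(shared_w, heads, new_head)
```

In this toy setting, both datasets contribute to the shared weights even though their labels differ by a constant, and the downstream task only needs a handful of samples because it reuses the shared part. The distillation stage of the workflow is not covered by this sketch.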