Notebook for DPA-2: a large atomic model as a multi-task learner
Tags: DeePMD-kit, DPA-2
2043899742@qq.com
AIS-Square
Updated on 2024-10-14
Recommended image: DeePMD-kit:3.0.0b3-cuda12.1
Recommended machine: c8_m32_1 * NVIDIA V100
Datasets: dpa2-finetune-example-water(v2), dpa2_data(v6)

©️ Copyright 2023 @ Authors
Authors: Xinzijian Liu📨 Chengqian Zhang📨
Date: 2023-12-20
License: Attribution-NonCommercial-ShareAlike 4.0 International
Quick start: Click the blue Connect button at the top of the page, select the `registry.dp.tech/dptech/deepmd-kit:3.0.0b3-cuda12.1` image and the `c12_m92_1 * NVIDIA V100` machine, mount the `dpa2_data(v6)` and `dpa2-finetune-example-water(v2)` datasets, and wait a few moments before running the notebook.

Introduction

Paper link: https://arxiv.org/abs/2312.15492

The code, datasets, and input scripts are all available on Zenodo (https://doi.org/10.5281/zenodo.10428497).

This notebook was updated on 2024-09-14 for the beta3 code version (DeePMD-kit 3.0.0b3).

The rapid development of artificial intelligence (AI) is driving significant changes in the field of atomic modeling, simulation, and design. AI-based potential energy models have been successfully used to perform large-scale and long-time simulations with the accuracy of ab initio electronic structure methods. However, the model generation process still hinders applications at scale. We envision that the next stage would be a model-centric ecosystem, in which a large atomic model (LAM), pre-trained with as many atomic datasets as possible and can be efficiently fine-tuned and distilled to downstream tasks, would serve the new infrastructure of the field of molecular modeling. We propose DPA-2, a novel architecture for a LAM, and develop a comprehensive pipeline for model fine-tuning, distillation, and application, associated with automatic workflows. We show that DPA-2 can accurately represent a diverse range of chemical systems and materials, enabling high-quality simulations and predictions with significantly reduced efforts compared to traditional methods. Our approach paves the way for a universal large atomic model that can be widely applied in molecular and material simulation research, opening new opportunities for scientific discoveries and industrial applications.

Before running this notebook, copy the mounted data into /root/ and inspect the directory layout.

[2]
%%bash
cd /root/
cp -r /bohr/qscft-rbns/v6/src/ ./
tree -L 3
.
└── src
    ├── data
    │   ├── FerroEle_train
    │   ├── FerroEle_valid
    │   ├── H2O-PD_train
    │   ├── H2O-PD_valid
    │   ├── SemiCond_train
    │   └── SemiCond_valid
    ├── md
    │   └── water_192
    ├── model
    │   ├── H2O-PD.pt
    │   └── OpenLAM_2.2.0_27heads_beta3.pt
    └── train
        ├── finetune
        ├── multitask
        └── singletask

15 directories, 2 files

  • data: This directory contains three datasets: FerroEle, H2O-PD and SemiCond. FerroEle is a small subset of the dataset FerroEle_DPA_v1_0, H2O-PD (task head H2O_H2O-PD) is a small subset of the dataset H2O-PD_DPA_v1_0, and SemiCond is a small subset of the dataset SemiCond_DPA_v1_0. The full datasets can be downloaded from the AIS Square website.
  • model: This directory contains one singletask model, H2O-PD.pt, and one multitask model, OpenLAM_2.2.0_27heads_beta3.pt. The multitask model is trained on 27 different datasets.
  • train: This directory covers three training modes: singletask training, multitask training, and finetuning from a pretrained model. Each mode is demonstrated below in its own folder.
  • md: This directory is used to demonstrate a molecular dynamics simulation with the DPA-2 model.

Model Loading

Trained singletask and multitask models shared by other users can be downloaded from the AIS Square website.

[3]
cd /root/src/data
/root/src/data
/opt/deepmd-kit-3.0.0b3/lib/python3.10/site-packages/IPython/core/magics/osm.py:417: UserWarning: This is now an optional IPython functionality, setting dhist requires you to install the `pickleshare` library.
  self.shell.db['dhist'] = compress_dhist(dhist)[-100:]

command line interface

The model can be used in many ways. The most straightforward test uses dp test.

First, we test the H2O-PD validation set with the singletask model H2O-PD.pt.

[4]
!dp test -m ../model/H2O-PD.pt -n 5 -s H2O-PD_valid

Here -m gives the model checkpoint to import, -s the path to the tested system, and -n the number of tested frames. Several other command line options can be passed to dp test; they can be listed with:

[5]
!dp test --help
usage: dp test [-h] [-v {DEBUG,3,INFO,2,WARNING,1,ERROR,0}] [-l LOG_PATH]
               [-m MODEL] [-s SYSTEM | -f DATAFILE] [-S SET_PREFIX]
               [-n NUMB_TEST] [-r RAND_SEED] [--shuffle-test] [-d DETAIL_FILE]
               [-a] [--head HEAD]

options:
  -h, --help            show this help message and exit
  -v {DEBUG,3,INFO,2,WARNING,1,ERROR,0}, --log-level {DEBUG,3,INFO,2,WARNING,1,ERROR,0}
                        set verbosity level by string or number, 0=ERROR, 1=WARNING, 2=INFO and 3=DEBUG (default: INFO)
  -l LOG_PATH, --log-path LOG_PATH
                        set log file to log messages to disk, if not specified, the logs will only be output to console (default: None)
  -m MODEL, --model MODEL
                        Frozen model file (prefix) to import. TensorFlow backend: suffix is .pb; PyTorch backend: suffix is .pth. (default: frozen_model)
  -s SYSTEM, --system SYSTEM
                        The system dir. Recursively detect systems in this directory (default: .)
  -f DATAFILE, --datafile DATAFILE
                        The path to the datafile, each line of which is a path to one data system. (default: None)
  -S SET_PREFIX, --set-prefix SET_PREFIX
                        [DEPRECATED] Deprecated argument. (default: None)
  -n NUMB_TEST, --numb-test NUMB_TEST
                        The number of data for test. 0 means all data. (default: 0)
  -r RAND_SEED, --rand-seed RAND_SEED
                        The random seed (default: None)
  --shuffle-test        Shuffle test data (default: False)
  -d DETAIL_FILE, --detail-file DETAIL_FILE
                        The prefix to files where details of energy, force and virial accuracy/accuracy per atom will be written (default: None)
  -a, --atomic          Test the accuracy of atomic label, i.e. energy / tensor (dipole, polar) (default: False)
  --head HEAD           (Supported backend: PyTorch) Task head to test if in multi-task mode. (default: None)

examples:
    dp test -m graph.pb -s /path/to/system -n 30

We then test the H2O-PD validation set with the multitask model OpenLAM_2.2.0_27heads_beta3.pt. Note that testing with a multitask model requires specifying the task head.

[6]
!dp test -m ../model/OpenLAM_2.2.0_27heads_beta3.pt -n 5 -s H2O-PD_valid --head H2O_H2O-PD
  • --head: Task head to test if in multi-task mode.
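
When the same test is repeated for several models or heads, the command can be assembled programmatically. The sketch below is a minimal helper (the name `build_dp_test_command` is hypothetical and not part of DeePMD-kit) that builds the argument list using only options shown in the help output above: -m, -s, -n and --head.

```python
import shlex


def build_dp_test_command(model, system, numb_test=0, head=None):
    """Assemble a `dp test` argument list (hypothetical helper).

    Only flags listed in the `dp test --help` output are used.
    """
    cmd = ["dp", "test", "-m", model, "-s", system, "-n", str(numb_test)]
    if head is not None:
        # --head is only needed when the model is a multitask model
        cmd += ["--head", head]
    return cmd


single = build_dp_test_command("../model/H2O-PD.pt", "H2O-PD_valid", numb_test=5)
multi = build_dp_test_command(
    "../model/OpenLAM_2.2.0_27heads_beta3.pt",
    "H2O-PD_valid",
    numb_test=5,
    head="H2O_H2O-PD",
)

# Print the commands in a form that can be pasted into a shell
print(shlex.join(single))
print(shlex.join(multi))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` in an environment where the dp executable is available.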

python interface

The Python interface can be used with a DPA-2 model to obtain the energy, force, and virial of a specific structure.

[7]
import torch
from deepmd.pt.infer.deep_eval import DeepPot
import numpy as np

# This structure is a 96-atom water configuration (32 atoms of type 7 = O, 64 atoms of type 0 = H)
atype = [7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
coords = np.array([[[ 4.88086987e+00, 5.08849001e+00, 5.81736994e+00],
[ 1.67288995e+00, 2.79171991e+00, 6.14008999e+00],
[ 6.33997011e+00, 5.87844992e+00, 1.45858002e+00],
[ 5.30080986e+00, 7.28452015e+00, 6.98121023e+00],
[ 3.75292993e+00, 5.94418001e+00, 8.13750029e-01],
[ 5.65410995e+00, 6.82400018e-02, 4.64956999e+00],
[ 4.69849014e+00, 1.07639998e-01, 2.39014006e+00],
[ 3.33181000e+00, 3.83896995e+00, 7.35652018e+00],
[ 1.38023996e+00, 2.30410004e+00, 3.73708010e+00],
[ 3.24392009e+00, 3.83446002e+00, 4.62864017e+00],
[ 6.81804991e+00, 1.77417004e+00, 9.16799977e-02],
[ 1.44351006e+00, 4.67081022e+00, 5.27610004e-01],
[ 3.63952994e+00, 6.09609985e+00, 3.87172008e+00],
[ 6.60783005e+00, 3.06886005e+00, 5.08682013e+00],
[ 4.59574986e+00, 2.11696005e+00, 1.23446000e+00],
[ 9.44079995e-01, 6.51146984e+00, 3.75208998e+00],
[ 1.95975995e+00, 2.45908999e+00, 1.20615995e+00],
[ 1.90281999e+00, 4.64120007e+00, 2.83933997e+00],
[ 4.15106010e+00, 4.22786999e+00, 2.35461998e+00],
[ 3.15064001e+00, 1.02985001e+00, 4.58635998e+00],
[ 2.36810994e+00, 3.50479990e-01, 2.19712996e+00],
[ 6.75059986e+00, 3.17266989e+00, 2.38828993e+00],
[ 4.99685001e+00, 2.19505000e+00, 3.72282004e+00],
[ 2.89387989e+00, 8.82499993e-01, 6.97205019e+00],
[ 6.02069998e+00, 5.09185982e+00, 3.67828989e+00],
[ 4.65655994e+00, 2.32630992e+00, 6.16604996e+00],
[ 6.03310013e+00, 3.99452996e+00, 1.01709999e-01],
[ 1.11740005e+00, 6.97051001e+00, 3.97839993e-01],
[ 8.78780007e-01, 6.62689984e-01, 5.63720989e+00],
[ 2.55311990e+00, 5.88978004e+00, 5.95164013e+00],
[ 2.21489996e-01, 5.52580976e+00, 5.79211998e+00],
[-9.17000044e-03, 9.28219974e-01, 2.29167008e+00],
[ 3.98556995e+00, 4.34110022e+00, 5.18548012e+00],
[ 4.21743011e+00, 5.72653008e+00, 6.09771013e+00],
[ 5.81910014e-01, 2.03183007e+00, 7.19089985e+00],
[ 2.31051993e+00, 2.11350989e+00, 6.47226000e+00],
[ 6.95373011e+00, 5.62523985e+00, 2.17477989e+00],
[ 6.72261000e+00, 6.29493999e+00, 5.44109011e+00],
[ 5.46083021e+00, 7.31151009e+00, 5.98238993e+00],
[ 3.88695002e+00, 6.64919972e-01, 6.95054007e+00],
[ 4.55715990e+00, 6.37985992e+00, 4.42229986e-01],
[ 2.90130997e+00, 3.28073001e+00, 6.15589976e-01],
[ 5.54563999e+00, 1.08753002e+00, 4.38471985e+00],
[ 5.16894007e+00, 6.81580019e+00, 4.15231991e+00],
[ 4.61714983e+00, 1.15058994e+00, 1.42268002e+00],
[ 5.75011015e+00, 2.56732011e+00, 3.15859008e+00],
[ 1.59423006e+00, 3.51257992e+00, 6.73897982e+00],
[ 8.17659974e-01, 9.93100032e-02, 6.87493992e+00],
[ 2.15335989e+00, 1.75983000e+00, 4.06019020e+00],
[ 9.04070020e-01, 1.67244995e+00, 3.11746001e+00],
[ 2.53470993e+00, 3.43996000e+00, 5.22603989e+00],
[ 3.78940010e+00, 3.18470001e+00, 4.03200006e+00],
[ 5.76567984e+00, 6.45699978e-01, 7.30638981e+00],
[ 6.27276993e+00, 2.76024008e+00, 7.39845991e+00],
[ 1.63831997e+00, 4.72740984e+00, 1.57593000e+00],
[ 5.38829982e-01, 4.33784008e+00, 5.76900005e-01],
[ 3.06592011e+00, 5.92568016e+00, 4.67959976e+00],
[ 2.18319988e+00, 7.26775980e+00, 3.03326988e+00],
[ 3.42449993e-01, 2.80690002e+00, 4.48589993e+00],
[ 5.42928982e+00, 2.70057988e+00, 5.66814995e+00],
[ 2.26852989e+00, 1.73099005e+00, 1.78567004e+00],
[ 4.41227007e+00, 3.39329004e+00, 1.89870000e+00],
[ 6.81015015e+00, 5.67325020e+00, 3.87030005e+00],
[ 5.66749990e-01, 7.17967987e+00, 3.08417010e+00],
[ 1.17006004e+00, 2.81805992e+00, 1.67999995e+00],
[ 1.70717001e+00, 3.76209998e+00, 3.24534011e+00],
[ 3.25989008e+00, 5.42201996e+00, 3.27817988e+00],
[ 1.48038006e+00, 5.37980986e+00, 3.36378002e+00],
[ 4.97062016e+00, 4.61452007e+00, 2.79702997e+00],
[ 4.06063986e+00, 5.42372990e+00, 1.59935999e+00],
[ 4.12667990e+00, 1.88898003e+00, 5.46392012e+00],
[ 2.46283007e+00, 7.42739975e-01, 5.22220993e+00],
[ 1.69474006e+00, 1.34199997e-02, 1.47354996e+00],
[ 3.89336991e+00, 7.28771019e+00, 2.28997993e+00],
[ 6.21744013e+00, 3.42370009e+00, 1.60818005e+00],
[ 6.68333006e+00, 4.00273991e+00, 2.97725010e+00],
[ 4.50516987e+00, 1.62489998e+00, 3.14264011e+00],
[ 3.33305001e+00, 1.38720006e-01, 4.14518023e+00],
[ 2.60663009e+00, 6.67814016e+00, 6.50474024e+00],
[ 2.77786994e+00, 7.44029999e-01, 4.91820008e-01],
[ 5.37717009e+00, 6.69230986e+00, 2.00269008e+00],
[ 5.59570980e+00, 4.83723021e+00, 4.56727982e+00],
[ 4.73746014e+00, 2.22565007e+00, 2.75880009e-01],
[ 3.92073011e+00, 3.30237007e+00, 6.73226976e+00],
[ 5.15031004e+00, 4.52967978e+00, 6.61049986e+00],
[ 5.92167997e+00, 4.81593990e+00, 6.31749988e-01],
[ 1.49842000e+00, 5.95369005e+00, 3.53689998e-01],
[ 6.97584009e+00, 6.45935011e+00, 8.98949981e-01],
[ 8.66829991e-01, 4.45000008e-02, 4.84080982e+00],
[ 1.99149996e-01, 1.31905997e+00, 5.42710018e+00],
[ 3.50265002e+00, 4.80391979e+00, 2.36179993e-01],
[ 1.58536005e+00, 5.62666988e+00, 5.81293011e+00],
[ 6.92963982e+00, 5.35320997e+00, 6.67840004e+00],
[ 6.93877983e+00, 3.84588003e+00, 5.63868999e+00],
[ 6.09070015e+00, 8.65360022e-01, 2.58961010e+00],
[ 6.96416998e+00, 1.27883005e+00, 1.35783005e+00]]])
cells = np.array([[[ 7.02786398, 0. , 0. ],
[ 0.14013857, 7.30525923, 0. ],
[-0.12944618, 0.04454867, 7.40261316]]])
label_energy = np.array([-490.48730469])

model = DeepPot("../model/H2O-PD.pt")
energy, force, virial = model.eval(coords, cells, atype)
print("Predict energy: %.5f"%(energy[0][0]))
print("Label energy: %.5f"%(label_energy[0]))
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
Predict energy: -490.54547
Label energy: -490.48730
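
To compare the two energies above on a per-atom scale, the difference can be divided by the number of atoms. The sketch below hard-codes the two printed values and the atom count (96, the length of atype). It assumes the energies are in eV, the usual DeePMD-kit convention, which this notebook does not state explicitly.

```python
# Values copied from the cell output above
predicted_energy = -490.54547
label_energy = -490.48730
natoms = 96  # len(atype) in the cell above

# Assumption: energies are in eV, so the result is in eV/atom
error_per_atom = (predicted_energy - label_energy) / natoms
print(f"Energy error: {error_per_atom * 1000:.3f} meV/atom")
```

Under that assumption the deviation is roughly -0.6 meV per atom for this single frame.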

Model Training

This section demonstrates singletask training, multitask training, and finetuning from a pretrained model.


singletask

[4]
cd /root/src/train/singletask
/root/src/train/singletask

Here we use a small subset of the dataset H2O-PD_DPA_v1_0 to train a singletask model. The file input.json is the input script.

[9]
cat input.json

The parameters in input.json are the same as in previous versions of DeePMD-kit, except for the descriptor section. For more details, see the paper.

The training can be invoked by:

[11]
!dp --pt train input.json

During the training, the error of the model is tested every disp_freq training steps. The training error and validation error are printed correspondingly in the file disp_file (default is lcurve.out). The batch size can be set in the input script by the key batch_size in the corresponding sections for the training and validation data set. An example of the output:

[12]
!head lcurve.out

The file contains 8 columns, from left to right, which are the training step, the validation loss, training loss, root mean square (RMS) validation error of energy, RMS training error of energy, RMS validation error of force, RMS training error of force and the learning rate. The RMS error (RMSE) of the energy is normalized by the number of atoms in the system. One can visualize this file with a simple Python script:

[13]
import numpy as np
import matplotlib.pyplot as plt

data = np.genfromtxt("lcurve.out", names=True)
for name in data.dtype.names[1:-1]:
    plt.plot(data["step"], data[name], label=name)
plt.legend()
plt.xlabel("Step")
plt.ylabel("Loss")
plt.xscale("symlog")
plt.yscale("log")
plt.grid()
plt.show()
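
Besides plotting, the final errors can be read from the last row of the file. The sketch below parses a small in-memory sample rather than a real lcurve.out. The column names follow the eight-column layout described above but are an assumption, and the numbers are made up for illustration, so check both against the header of your own file.

```python
import io

import numpy as np

# Hypothetical sample in the eight-column layout described above
sample = io.StringIO(
    "# step rmse_val rmse_trn rmse_e_val rmse_e_trn rmse_f_val rmse_f_trn lr\n"
    "0   2.5e+01 2.4e+01 1.2e+00 1.1e+00 8.0e-01 7.9e-01 1.0e-03\n"
    "100 5.0e+00 4.8e+00 3.0e-02 2.9e-02 2.0e-01 1.9e-01 9.0e-04\n"
)

# names=True takes the column names from the commented header line
data = np.genfromtxt(sample, names=True)

# The last row holds the most recent training step
last = data[-1]
print(
    f"step {int(last['step'])}: "
    f"energy RMSE (val) = {last['rmse_e_val']:.3e}, "
    f"force RMSE (val) = {last['rmse_f_val']:.3e}"
)
```

Replacing `sample` with the path "lcurve.out" applies the same lookup to a real training run.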

multitask

[5]
cd ../multitask
/root/src/train/multitask

Training on multiple datasets (each containing several data systems) can be performed in multi-task mode, with one common descriptor shared by all datasets and a specific fitting net for each dataset. To enable multi-task mode, change the following parameters in the training input script:

  • model –> model_dict, each key of which is one individual fitting net.

  • training_data, validation_data –> data_dict, each key of which is one individual dataset (containing several data systems) for the corresponding fitting net; the keys must be consistent with those in model_dict.

  • loss –> loss_dict, each key of which is one individual loss setting for the corresponding fitting net; the keys must be consistent with those in model_dict.

  • model_prob, each key of which can be a non-negative integer or float, deciding the chosen probability for corresponding fitting net in training.
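
The key-consistency requirement in the list above can be checked before launching a run. The sketch below is a hypothetical helper, not part of DeePMD-kit. It takes the four dictionaries directly (their exact location inside input.json is not shown here), verifies that each one uses the same head names as model_dict, and normalizes model_prob into probabilities. The normalization step is an assumption about how the weights are interpreted.

```python
def check_multitask_keys(model_dict, data_dict, loss_dict, model_prob):
    """Check head-name consistency and return normalized head probabilities."""
    heads = set(model_dict)
    sections = (
        ("data_dict", data_dict),
        ("loss_dict", loss_dict),
        ("model_prob", model_prob),
    )
    for name, section in sections:
        if set(section) != heads:
            raise ValueError(
                f"{name} keys {sorted(section)} do not match "
                f"model_dict keys {sorted(heads)}"
            )
    if any(weight < 0 for weight in model_prob.values()):
        raise ValueError("model_prob values must be non-negative")
    total = sum(model_prob.values())
    if total <= 0:
        raise ValueError("at least one model_prob value must be positive")
    # Assumption: weights are relative and are normalized to sum to 1
    return {head: weight / total for head, weight in model_prob.items()}


heads = ["FerroEle", "H2O-PD", "SemiCond"]  # illustrative head names
probs = check_multitask_keys(
    model_dict={head: {} for head in heads},
    data_dict={head: {} for head in heads},
    loss_dict={head: {} for head in heads},
    model_prob={"FerroEle": 1, "H2O-PD": 2, "SemiCond": 1},
)
print(probs)
```

With these illustrative weights, the H2O-PD head would be selected in half of the training steps and each of the other two heads in a quarter.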

Here we use three datasets (small subsets of FerroEle_DPA_v1_0, H2O-PD_DPA_v1_0, and SemiCond_DPA_v1_0) to train a multitask model with three task heads.

[15]
cat input.json

The training procedure will automatically choose single-task or multi-task mode, based on the above parameters. The training can be invoked by

[16]
!dp --pt train input.json
[17]
!head lcurve.out

finetune

[6]
cd ../finetune
/root/src/train/finetune

Pretraining-and-finetuning is a widely used approach in other fields such as Computer Vision (CV) or Natural Language Processing (NLP) to vastly reduce the training cost, while it’s not trivial in potential models. Compositions and configurations of data samples or even computational parameters in upstream software (such as VASP) may be different between the pretrained and target datasets, leading to energy shifts or other diversities of training data.
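
One way to picture the energy shift mentioned above: if two datasets label the same configurations with different reference energies, the difference can often be approximated by a constant offset per element. The sketch below fits such per-element offsets by least squares on synthetic numbers. It only illustrates the concept and is not the procedure DeePMD-kit uses internally; every value in it is made up.

```python
import numpy as np

# Number of O and H atoms in four synthetic frames
counts = np.array(
    [[32, 64], [16, 32], [8, 18], [10, 16]],
    dtype=float,
)

# Hypothetical per-element offsets (eV/atom) between two labelling setups
true_shift = np.array([-1.5, 0.25])

# Energy difference between the two labellings of each frame
delta_e = counts @ true_shift

# Least-squares fit of delta_e ≈ counts @ shift
shift, *_ = np.linalg.lstsq(counts, delta_e, rcond=None)
print(shift)
```

At least one frame must have a different O:H ratio from the others; otherwise the two offsets cannot be separated.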

The multitask training mode can overcome the above difficulties. The DPA-2 model is expected to learn the common knowledge in the pretraining datasets and thus reduce the computational cost of downstream training tasks.

Here we have a multitask model, OpenLAM_2.2.0_27heads_beta3.pt, pretrained on a large collection of 27 different datasets. Finetuning from it can be performed by simply running:

[22]
!dp --pt train input.json --finetune ../../model/OpenLAM_2.2.0_27heads_beta3.pt --model-branch H2O_H2O-PD

The finetune procedure inherits the descriptor's neural network parameters from the pretrained multitask model. The fitting net is either re-initialized or inherited from a branch of the pretrained model, depending on the argument -m.

  • -m (--model-branch): Model branch chosen for fine-tuning if multi-task. If not specified, it will re-init the fitting net.

The training data in all three modes (singletask, multitask, and finetune) include H2O-PD, so the validation errors on H2O-PD can be compared directly with a Python script:

[23]
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(14,4))
data_singletask = np.genfromtxt("../singletask/lcurve.out", names=True)
data_multitask = np.genfromtxt("../multitask/lcurve.out", names=True)
data_finetune = np.genfromtxt("lcurve.out", names=True)
for idx,ii in enumerate([3,5]):
    plt.subplot(1,3,idx+1)
    for name in data_singletask.dtype.names[ii:ii+1]:
        plt.plot(data_singletask["step"], data_singletask[name], label=f"singletask_{name}")
    for name in data_finetune.dtype.names[ii:ii+1]:
        plt.plot(data_finetune["step"], data_finetune[name], label=f"finetune_{name}")
    for name in data_multitask.dtype.names[ii:ii+1]:
        plt.plot(data_multitask["step"], data_multitask[name], label=f"multitask_{name}")
    plt.legend()
    plt.xlabel("Step")
    plt.ylabel("Loss")
    #plt.xscale("symlog")
    plt.yscale("log")
    plt.grid()
plt.show()

dp freeze

A frozen model file with the .pth extension, which is used for molecular dynamics simulations, can be produced with dp freeze.

[24]
!dp --pt freeze
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
[2024-09-14 11:42:54,054] DEEPMD INFO    DeePMD version: 3.0.0b3

Molecular Dynamics


The model can drive molecular dynamics in LAMMPS.

[7]
cd ../../md/water_192
/root/src/md/water_192
[8]
ls
data.dpa2*  dpa2_in.lammps*  dpa2_model.pth*

Here data.dpa2 gives the initial configuration of the water MD simulation, and dpa2_in.lammps is the LAMMPS input script. Inspecting dpa2_in.lammps shows that it is a fairly standard LAMMPS input file for an MD simulation, with only two special lines:

[9]
'''
# See https://deepmd.rtfd.io/lammps/ for usage
pair_style deepmd dpa2_model.pth
# If atom names (O H in this example) are not set in the pair_coeff command, the type_map defined by the training parameter will be used by default.
pair_coeff * * O H
'''

where the pair style deepmd is invoked and the model file dpa2_model.pth is provided, which means the atomic interaction will be computed by the DPA-2 model that is stored in the file dpa2_model.pth.
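
These two lines depend only on the model file name and the element order, so they are easy to generate when preparing inputs for several models. A minimal sketch follows; the helper name `deepmd_pair_lines` is hypothetical, and its output mirrors the two lines quoted above.

```python
def deepmd_pair_lines(model_file, type_names):
    """Return the two DeePMD-specific LAMMPS lines for a frozen model."""
    return [
        f"pair_style deepmd {model_file}",
        # Element names follow the LAMMPS atom type order (type 1, type 2, ...)
        "pair_coeff * * " + " ".join(type_names),
    ]


lines = deepmd_pair_lines("dpa2_model.pth", ["O", "H"])
print("\n".join(lines))
```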


In an environment with a compatible version of LAMMPS, the deep potential molecular dynamics can be performed via

[10]
!lmp -i dpa2_in.lammps

Distillation


Distillation can significantly improve the efficiency of finetuned models in production MD simulations. Distillation requires DP-Gen2. For details, refer to the notebook https://bohrium.dp.tech/notebooks/62585747598 or https://bohrium.dp.tech/notebooks/76262686918.

DP-Gen based on a DPA-2 pretrained model

Finetuning from a DPA-2 pretrained model can reduce the amount of data required for training. Running DP-Gen with a DPA-2 pretrained model can also reduce the amount of first-principles labelling. DP-Gen with DPA-2 requires DP-Gen2. For details, refer to the notebook https://bohrium.dp.tech/notebooks/62585747598 or https://bohrium.dp.tech/notebooks/76262686918.

Tips

  1. Users are welcome to explore the DP Combo web server, which helps automate operations such as model training and model distillation. Related notebooks: DP Combo教程 (DP Combo tutorial), 借助DP Combo一键丝滑生成半导体势函数 (generating a semiconductor potential in one click with DP Combo), and 固态电解质实战 | DP Combo@APP体验 (solid-state electrolytes in practice | DP Combo@APP experience).

  2. The current DPA-2 model does not yet support features such as zbl, which will be implemented in the near future. To use these features, use a previous version of DeePMD-kit (GitHub). Related notebook: DeePMD 使用教程、科研案例、问题收集合集 (collection of DeePMD tutorials, research cases, and common questions).
Comments

On " <a href="https://nb....":
dfzshiwo@163.com (12-24 20:54): The link is wrong.
2043899742@qq.com (author, 12-25 00:48): Fixed, thank you.

On " - data: This directo...":
jianzhifu@vip.163.com (01-09 02:35): Which folder is H2O_H2O-PD in?
2043899742@qq.com (author, 01-12 03:51): Reply to jianzhifu@vip.163.com: it is under the folders src/data/H2O-PD_train and src/data/H2O-PD_valid.

On " We then use the mult...":
cjxxjc729 (02-03 22:49): what is the task head?
Samoyezii (09-05 04:39): Available ones are ['Domains_Alloy', 'Domains_Anode', 'Domains_Cluster', 'Domains_Drug', 'Domains_FerroEle', 'Domains_OC2M', 'Domains_SSE-PBE', 'Domains_SemiCond', 'H2O_H2O-PD', 'Metals_AgAu-PBE', 'Metals_AlMgCu', 'Metals_Cu', 'Metals_Sn', 'Metals_Ti', 'Metals_V', 'Metals_W', 'Others_C12H26', 'Others_HfO2'].