Fine-tuning from a DPA-2 pretrained model can reduce the amount of data required for training. Running DP-Gen with a DPA-2 pretrained model can likewise reduce the number of first-principles labelling calculations. DP-Gen with DPA-2 requires DP-Gen2, so first install the latest version of DP-Gen2.
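A minimal sketch of the installation step, assuming DP-Gen2 is installed with pip from its GitHub repository (https://github.com/deepmodeling/dpgen2) and dpdata is upgraded from the package index. The helper name `pip_install_command` and the `RUN_INSTALL` switch are illustrative, not part of DP-Gen2; by default the commands are only printed.

```python
import importlib.util
import subprocess
import sys

RUN_INSTALL = False  # set to True to actually run pip


def pip_install_command(target, upgrade=False):
    """Build a pip install command for the current Python interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    cmd.append(target)
    return cmd


commands = [
    pip_install_command("git+https://github.com/deepmodeling/dpgen2"),
    pip_install_command("dpdata", upgrade=True),
]

for cmd in commands:
    print(" ".join(cmd))
    if RUN_INSTALL:
        subprocess.run(cmd, check=True)

# Report whether the package can be found in this environment.
print("dpgen2 importable:", importlib.util.find_spec("dpgen2") is not None)
```

In a notebook, the same two installs are usually written as shell-escaped pip commands in a single cell.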
The dataset for this example provides a pretrained model, initial training data for water, and the PBE functional files required for VASP calculations. Link them into the working directory.
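One way to do the linking is with symbolic links, sketched below. The helper `link_into` and the file names `init_data` and `model.pt` are hypothetical placeholders; substitute the actual paths in your dataset. The demo runs in a temporary directory so it does not touch real files.

```python
import os
import tempfile
from pathlib import Path


def link_into(workdir, sources):
    """Create a symlink in workdir for every source path; return the links."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in sources:
        src = Path(src).resolve()
        dst = workdir / src.name
        if dst.is_symlink():
            dst.unlink()  # refresh a stale link from an earlier run
        if not dst.exists():
            os.symlink(src, dst, target_is_directory=src.is_dir())
        links.append(dst)
    return links


# Demo with placeholder files (hypothetical names, not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp) / "dataset"
    (dataset / "init_data").mkdir(parents=True)
    (dataset / "model.pt").write_text("placeholder")
    linked = link_into(Path(tmp) / "work", [dataset / "init_data", dataset / "model.pt"])
    print(sorted(p.name for p in linked))
```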
First, we need the initial training data. Next, we prepare the initial configurations for MD exploration; here we randomly sample 100 configurations from the training data.
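The random selection itself can be sketched with the standard library alone. The frame count of 1000 and the fixed seed are assumptions for illustration; in practice the indices would be applied to the loaded training data (for example with dpdata) to write out the selected frames, which is omitted here.

```python
import random


def sample_frame_indices(n_frames, n_sample=100, seed=0):
    """Pick n_sample distinct frame indices uniformly at random."""
    if n_sample > n_frames:
        raise ValueError("cannot sample more frames than are available")
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return sorted(rng.sample(range(n_frames), n_sample))


# Assumed size of the training set; replace with the real number of frames.
indices = sample_frame_indices(n_frames=1000, n_sample=100)
print(len(indices))
```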
Next, we prepare the DP-Gen2 input file, `input.json`. You can specify a name for the workflow in the `name` field. By default, the workflow server https://workflows.deepmodeling.com is used. In `bohrium_config`, fill in your Bohrium username, password, and project ID. Set `init_data_sys` to the list of system paths for the initial training data. The training, exploration, and first-principles sections each require an input file template (for DP, LAMMPS, and VASP, respectively), which are provided later.

In `train`, `init_models_paths` requires the paths of the pretrained models; here we provide 4 identical paths. In `explore`, `configurations` should point to the initial configuration files we just prepared. `stages` specifies the settings for the MD simulations: `n_sample` determines how many configurations are sampled from the initial configurations per iteration, and `revisions` specifies the values of the variables in the LAMMPS input file template. Each variable's value can be a list, and the final set of combinations is the Cartesian product of all lists. For details on the parameters, refer to the documentation at https://docs.deepmodeling.com/projects/dpgen2/en/latest/.
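To make the Cartesian-product rule concrete, here is a small sketch of how a set of per-variable lists expands into individual parameter combinations. The variable names `V_NSTEPS`, `V_TEMP`, and `V_DUMPFREQ` and their values are assumptions chosen for illustration, and `expand_revisions` is not a DP-Gen2 function.

```python
from itertools import product


def expand_revisions(revisions):
    """Expand {variable: value or list of values} into all combinations."""
    keys = list(revisions)
    value_lists = [v if isinstance(v, list) else [v] for v in revisions.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


# Hypothetical variables: 1 step count x 3 temperatures x 2 output frequencies.
revisions = {
    "V_NSTEPS": [1000],
    "V_TEMP": [300, 400, 500],
    "V_DUMPFREQ": [10, 20],
}
combos = expand_revisions(revisions)
print(len(combos))  # 1 * 3 * 2 = 6 combinations
print(combos[0])
```

Each resulting dictionary corresponds to one MD setup, so adding a value to any list multiplies the number of simulations.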
Here is a simple LAMMPS input template, `template.lammps`, for NVT simulations; the number of steps, the temperature, and the output frequency are provided as variables.
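The sketch below illustrates the idea of a template with variables: placeholder tokens in the text are replaced by concrete values before the run. The fragment is not the tutorial's template; the placeholder names (`V_NSTEPS`, `V_TEMP`, `V_DUMPFREQ`), the damping value, and the plain string substitution are assumptions, and a real input also needs the system and potential definitions.

```python
# Fragment of an NVT input; upper-case tokens are placeholders.
LAMMPS_TEMPLATE = """\
thermo          V_DUMPFREQ
dump            1 all custom V_DUMPFREQ traj.dump id type x y z
fix             1 all nvt temp V_TEMP V_TEMP 0.1
run             V_NSTEPS
"""


def render_template(template, values):
    """Replace every placeholder token with its value."""
    text = template
    # Longest names first, so one placeholder cannot clobber a longer one.
    for key in sorted(values, key=len, reverse=True):
        text = text.replace(key, str(values[key]))
    return text


rendered = render_template(
    LAMMPS_TEMPLATE, {"V_NSTEPS": 1000, "V_TEMP": 300, "V_DUMPFREQ": 10}
)
print(rendered)
```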
This is the DP training input template for DPA-2, `train.json`.
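Before submitting, it can help to sanity-check the training template. The sketch below assumes the template follows the usual DeePMD-kit layout with top-level `model`, `learning_rate`, `loss`, and `training` sections; the helper `missing_sections` is illustrative and the example dictionary is deliberately incomplete.

```python
import json

# Top-level sections assumed for a single-task DeePMD-kit training input.
REQUIRED_SECTIONS = ("model", "learning_rate", "loss", "training")


def missing_sections(config):
    """Return the assumed top-level sections that are absent from config."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Deliberately incomplete example; a real check would load train.json instead.
example = json.loads('{"model": {}, "learning_rate": {}, "training": {}}')
print(missing_sections(example))  # → ['loss']
```

For the real file, load it with `json.load` on `train.json` and expect an empty list.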
Here is the VASP input template, `INCAR`.
Finally, submit the DP-Gen workflow.
Workflow has been submitted (ID: water-dpgen-5mwgq, UID: 43c1bcb7-2d25-4612-8f72-cf84e6dbdf1f)
Workflow link: https://workflows.deepmodeling.com/workflows/argo/water-dpgen-5mwgq
The progress of the workflow can be tracked through the link printed above. The metrics for each iteration of DP-Gen can be obtained through the `dpgen2` command line.
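As a sketch of the command-line calls involved, the snippet below only builds and prints the commands. It assumes the `dpgen2` CLI offers `submit` and `status` subcommands that take the input file (and, for `status`, the workflow ID); check `dpgen2 --help` for the exact syntax in your installed version. The helper `dpgen2_command` is illustrative.

```python
import shlex


def dpgen2_command(subcommand, config, workflow_id=None):
    """Build a dpgen2 command line as an argument list (nothing is executed)."""
    cmd = ["dpgen2", subcommand, config]
    if workflow_id is not None:
        cmd.append(workflow_id)
    return cmd


# Assumed subcommand names; the workflow ID is the one reported at submission.
submit_cmd = dpgen2_command("submit", "input.json")
status_cmd = dpgen2_command("status", "input.json", "water-dpgen-5mwgq")

print(shlex.join(submit_cmd))
print(shlex.join(status_cmd))
```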
WARNING:root:Exploration scheduler not found in the global outputs
WARNING:root:no scheduler is finished