Distillation of DPA-2 models
DPA
zjgemi
2043899742@qq.com
Published 2023-12-19
Recommended image: deepmd-kit:stable-0411
Recommended machine type: c12_m92_1 * NVIDIA V100
dpa2-finetune-water-example(v3)

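Distillation can significantly improve the efficiency of a finetuned model in production MD simulations: a faster student model is trained on labels generated by the finetuned (teacher) model. The distillation workflow requires DP-Gen2, so first install the latest version of DP-Gen2 and upgrade dpdata.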

!pip install git+https://github.com/deepmodeling/dpgen2
!pip install -U dpdata
Successfully installed argo-workflows-5.0.0 cloudpickle-2.2.0 dpgen2-0.0.8.dev84+g8733ff5 dpgui-1.0.0 fpop-0.0.7 jsonpickle-3.0.4 kubernetes-29.0.0 minio-7.2.5 pydflow-1.8.61 waitress-3.0.0
Successfully installed dpdata-0.2.18
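
Before continuing, it can help to confirm that both packages are visible to the notebook kernel. The following is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical, and the distribution names dpgen2 and dpdata are taken from the pip commands above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("dpgen2", "dpdata"):
    print(name, installed_version(name) or "not installed")
```

If either package is reported as not installed, restarting the kernel after the pip step is usually the first thing to try.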

The dataset attached to this example provides a finetuned model together with the training and validation data used for finetuning. Link them into the working directory.

!ln -s /bohr/dpa2-finetune-example-water-jbhv/v3/finetuned_model.pt teacher_model.pt
!ln -s /bohr/dpa2-finetune-example-water-jbhv/v3/H2O-PBE0TS-MD/train train
!ln -s /bohr/dpa2-finetune-example-water-jbhv/v3/H2O-PBE0TS-MD/valid valid
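
A dangling symbolic link would only surface as an error later, when the model or the data is loaded. As a hedged sketch (the helper name missing_links is hypothetical and not part of any DeePMD tool), the three links can be checked up front:

```python
from pathlib import Path
from typing import Iterable, List


def missing_links(names: Iterable[str], base: Path = Path(".")) -> List[str]:
    """Return the names under `base` that do not resolve to an existing target."""
    # Path.exists() follows symlinks, so a dangling link counts as missing.
    return [name for name in names if not (base / name).exists()]


broken = missing_links(["teacher_model.pt", "train", "valid"])
print("all links resolve" if not broken else f"missing or dangling: {broken}")
```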

Next, prepare the initial data for DP training: use the finetuned (teacher) model to label some data, for example the training data used for finetuning.

import dpdata
import numpy as np
import os
from deepmd.infer import DeepPot
from pathlib import Path
from tqdm import tqdm
from typing import List, Optional, Tuple


class DPPTPredict:
    """Relabel DeePMD data sets with the predictions of a teacher model."""

    def load_model(self, model: Path):
        self.dp = DeepPot(model)

    def evaluate(self,
                 coord: np.ndarray,
                 cell: Optional[np.ndarray],
                 atype: List[int]
                 ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        e, f, v = self.dp.eval(coord, cell, atype)
        return e.reshape([-1]), f, v.reshape([-1, 3, 3])

    def predict(self, input_path="input", output_path="output"):
        type_map = self.dp.get_type_map()
        # Every directory that contains a type.raw file is treated as one system.
        for type_file in tqdm(list(Path(input_path).rglob("type.raw"))):
            sys_dir = type_file.parent
            print(sys_dir)
            d = dpdata.MultiSystems()
            mixed_type = len(list(sys_dir.glob("*/real_atom_types.npy"))) > 0
            if mixed_type:
                d.load_systems_from_file(sys_dir, fmt="deepmd/npy/mixed")
            else:
                k = dpdata.LabeledSystem(sys_dir, fmt="deepmd/npy")
                d.append(k)
            for k in d:
                # Map the system's own type indices onto the model's type_map.
                anames = k["atom_names"]
                ori_atype = k["atom_types"]
                atype = np.array([type_map.index(anames[j]) for j in ori_atype])
                # Overwrite the labels with the teacher model's predictions.
                e, f, v = self.evaluate(k["coords"], k["cells"], atype)
                k.data["energies"] = e
                k.data["forces"] = f
                k.data["virials"] = v
            # For configurations in DP-Gen2 only accept 1-level dir
            out_dir = os.path.join(output_path, str(sys_dir.relative_to(input_path)).replace("/", "_"))
            if len(d) == 1:
                d[0].to_deepmd_npy_mixed(out_dir)
            else:
                # The multisystem is loaded from one dir, thus we can safely keep one dir
                d.to_deepmd_npy_mixed(out_dir + ".tmp")
                fs = os.listdir(out_dir + ".tmp")
                assert len(fs) == 1
                os.rename(os.path.join(out_dir + ".tmp", fs[0]), out_dir)
                os.rmdir(out_dir + ".tmp")


d = DPPTPredict()
d.load_model('teacher_model.pt')
d.predict('train', 'train_predict')
d.predict('valid', 'valid_predict')
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
  0%|          | 0/1 [00:00<?, ?it/s]train
/opt/deepmd-kit-3.0.0/lib/python3.10/site-packages/torch/nn/modules/module.py:1501: UserWarning: operator() sees varying value in profiling, ignoring and this should be handled by GUARD logic (Triggered internally at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1680572619157/work/third_party/nvfuser/csrc/parser.cpp:3777.)
  return forward_call(*args, **kwargs)
100%|██████████| 1/1 [02:03<00:00, 123.94s/it]
  0%|          | 0/1 [00:00<?, ?it/s]valid
100%|██████████| 1/1 [00:07<00:00,  7.48s/it]
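
The key step in predict is the index remapping: each system stores its atom types as indices into its own atom_names, while the model expects indices into its type_map. A minimal sketch of that remapping with made-up inputs follows; the function name remap_types and the sample arrays are illustrative only.

```python
import numpy as np
from typing import List, Sequence


def remap_types(atom_names: Sequence[str],
                atom_types: Sequence[int],
                type_map: List[str]) -> np.ndarray:
    """Translate per-system type indices into indices of the model's type_map."""
    return np.array([type_map.index(atom_names[j]) for j in atom_types])


# A system that lists H before O, evaluated with a model whose type_map is ["O", "H"].
print(remap_types(["H", "O"], [0, 0, 1], ["O", "H"]))  # [1 1 0]
```

Without this step, a system whose atom_names order differs from the model's type_map would be evaluated with its elements swapped.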

Next, prepare the initial configurations for MD exploration. Here we randomly sample about 100 configurations from the labeled training data, drawing from each system in proportion to its number of frames.

import os
import random

import dpdata

n_select = 100
m = dpdata.MultiSystems()
m.load_systems_from_file("train_predict", fmt="deepmd/npy/mixed")
if m.get_nframes() <= n_select:
    # Few enough frames: use the whole labeled data set as initial configurations.
    os.symlink("train_predict", "init")
else:
    ratio = n_select / m.get_nframes()
    new = dpdata.MultiSystems()
    for s in m:
        # Stochastic rounding keeps the expected number of selected frames at n_select.
        n = int(len(s)*ratio)
        if random.random() < len(s)*ratio - n:
            n += 1
        if n > 0:
            new.append(s.sub_system(random.sample(range(len(s)), n)))
    new.to_deepmd_npy_mixed("init")
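
The per-system frame count in the cell above is rounded stochastically: the fractional part of len(s)*ratio is used as the probability of rounding up, so the expected total equals n_select even though a single run may select slightly more or fewer frames. A small sketch of that rule in isolation, with made-up system sizes and a hypothetical helper name stochastic_count:

```python
import random


def stochastic_count(n_frames: int, ratio: float, rng: random.Random) -> int:
    """Round n_frames * ratio up or down at random so the expectation is exact."""
    target = n_frames * ratio
    n = int(target)
    if rng.random() < target - n:
        n += 1
    return n


rng = random.Random(0)
sizes = [30, 70, 250]                      # made-up frame counts per system
ratio = 100 / sum(sizes)
counts = [stochastic_count(s, ratio, rng) for s in sizes]
print(counts, sum(counts))                 # the total is close to 100, not always exactly 100
```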

Next, prepare the input file for DP-Gen2. The name field sets a name for the workflow. By default, the workflow server https://workflows.deepmodeling.com is used. In bohrium_config, fill in your Bohrium username, password, and project ID.

In inputs, type_map determines the type map of the final distilled model and mass_map gives the corresponding masses. init_data_sys lists the system paths of the initial training data prepared above, and the optional valid_data_sys lists the system paths of the validation data. The train and explore sections each require an input file template, for DP and LAMMPS respectively; both are provided later in this notebook.

In explore, configurations should point to the initial configuration files prepared above. stages specifies the settings of the MD simulations: n_sample is the number of configurations sampled from the initial configurations in each iteration, and revisions assigns values to the variables in the LAMMPS input file template. Each variable's value can be a list, and the simulations that are run are the Cartesian product of all lists. In fp, teacher_model_path points to the finetuned model that labels the configurations selected during exploration. For more details on the parameters, see the documentation at https://docs.deepmodeling.com/projects/dpgen2/en/latest/.
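
To make the Cartesian-product rule for revisions concrete, here is a small sketch of the combinatorics. It is not DP-Gen2's internal code; the function name expand_revisions and the second temperature value are hypothetical.

```python
from itertools import product
from typing import Dict, List


def expand_revisions(revisions: Dict[str, list]) -> List[Dict[str, object]]:
    """Return one dict per combination of the listed variable values."""
    keys = list(revisions)
    return [dict(zip(keys, values)) for values in product(*(revisions[k] for k in keys))]


# Hypothetical example: two temperatures and a single value for the other variables.
combos = expand_revisions({"V_NSTEPS": [10000], "V_TEMP": [300, 330], "V_DUMPFREQ": [200]})
for c in combos:
    print(c)
```

With one value per variable, as in the input file of this example, the product contains a single combination.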

%%file input.json
{
    "name": "water-distill",
    "bohrium_config": {
        "username": "<your-bohrium-username>",
        "password": "<your-bohrium-password>",
        "project_id": "<your-bohrium-project-id>",
        "_comment": "all"
    },
    "default_step_config": {
        "template_config": {
            "image": "registry.dp.tech/dptech/prod-11881/dpgen2-utils:1.2",
            "_comment": "all"
        },
        "_comment": "all"
    },
    "step_configs": {
        "run_train_config": {
            "template_config": {
                "image": "registry.dp.tech/dptech/deepmd-kit:2.2.7-cuda11.6",
                "_comment": "all"
            },
            "executor": {
                "type": "dispatcher",
                "retry_on_submission_error": 10,
                "image_pull_policy": "IfNotPresent",
                "machine_dict": {
                    "batch_type": "Bohrium",
                    "context_type": "Bohrium",
                    "remote_profile": {
                        "input_data": {
                            "job_type": "container",
                            "platform": "ali",
                            "scass_type": "1 * NVIDIA V100_16g"
                        }
                    }
                }
            },
            "_comment": "all"
        },
        "run_explore_config": {
            "template_config": {
                "image": "registry.dp.tech/dptech/deepmd-kit:2.2.7-cuda11.6",
                "_comment": "all"
            },
            "continue_on_success_ratio": 0.80,
            "executor": {
                "type": "dispatcher",
                "retry_on_submission_error": 10,
                "image_pull_policy": "IfNotPresent",
                "machine_dict": {
                    "batch_type": "Bohrium",
                    "context_type": "Bohrium",
                    "remote_profile": {
                        "input_data": {
                            "job_type": "container",
                            "platform": "ali",
                            "scass_type": "1 * NVIDIA V100_16g"
                        }
                    }
                }
            },
            "template_slice_config": {
                "group_size": 1,
                "pool_size": 1
            },
            "_comment": "all"
        },
        "run_fp_config": {
            "template_config": {
                "image": "registry.dp.tech/dptech/prod-11106/deepmd-kit:stable-0411",
                "_comment": "all"
            },
            "continue_on_success_ratio": 0.80,
            "executor": {
                "type": "dispatcher",
                "retry_on_submission_error": 10,
                "image_pull_policy": "IfNotPresent",
                "machine_dict": {
                    "batch_type": "Bohrium",
                    "context_type": "Bohrium",
                    "remote_profile": {
                        "input_data": {
                            "job_type": "container",
                            "platform": "ali",
                            "on_demand": 1,
                            "scass_type": "c12_m92_1 * NVIDIA V100"
                        }
                    }
                }
            },
            "template_slice_config": {
                "group_size": 100,
                "pool_size": 1
            },
            "_comment": "all"
        },
        "_comment": "all"
    },
    "upload_python_packages": [
        "/opt/deepmd-kit-3.0.0/lib/python3.10/site-packages/dpgen2",
        "/opt/deepmd-kit-3.0.0/lib/python3.10/site-packages/dpdata"
    ],
    "inputs": {
        "type_map": [
            "O",
            "H"
        ],
        "mixed_type": true,
        "mass_map": [
            16.0,
            4.0
        ],
        "init_data_prefix": null,
        "init_data_sys": [
            "train_predict"
        ],
        "valid_data_sys": [
            "valid_predict"
        ],
        "_comment": "all"
    },
    "train": {
        "type": "dp",
        "numb_models": 4,
        "config": {
            "init_model_policy": "yes",
            "init_model_old_ratio": 0.90,
            "init_model_numb_steps": 500000,
            "init_model_start_lr": 1e-4,
            "init_model_start_pref_e": 0.25,
            "init_model_start_pref_f": 100,
            "_comment": "all"
        },
        "template_script": "train.json",
        "_comment": "all"
    },
    "explore": {
        "type": "lmp",
        "config": {
            "command": "lmp -var restart 0"
        },
        "convergence": {
            "type": "adaptive-lower",
            "conv_tolerance": 0.005,
            "_numb_candi_f": 3000,
            "rate_candi_f": 0.15,
            "level_f_hi": 0.5,
            "n_checked_steps": 8,
            "_command": "all"
        },
        "max_numb_iter": 16,
        "fatal_at_max": false,
        "configuration_prefix": null,
        "configurations": [
            {
                "type": "file",
                "files": [
                    "init"
                ],
                "fmt": "deepmd/npy/mixed"
            }
        ],
        "stages": [
            [
                {
                    "type": "lmp-template",
                    "lmp": "template.lammps",
                    "trj_freq": 100,
                    "revisions": {
                        "V_NSTEPS": [
                            10000
                        ],
                        "V_TEMP": [
                            330
                        ],
                        "V_DUMPFREQ": [
                            200
                        ]
                    },
                    "sys_idx": [
                        0
                    ],
                    "n_sample": 4
                }
            ]
        ],
        "_comment": "all"
    },
    "fp": {
        "type": "deepmd",
        "task_max": 4000,
        "run_config": {
            "teacher_model_path": "teacher_model.pt"
        },
        "inputs_config": {},
        "_comment": "all"
    }
}
Overwriting input.json
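
The input file above still contains placeholder credentials and refers to local paths, so a quick sanity check before submission can save a failed run. The sketch below is an illustration under stated assumptions: the helper find_placeholders is hypothetical, and it only recognizes values written in the '<your-...>' style used in this notebook.

```python
import json
from pathlib import Path
from typing import Any, List


def find_placeholders(node: Any, path: str = "") -> List[str]:
    """Collect JSON paths whose string value still looks like '<your-...>'."""
    hits: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits += find_placeholders(value, f"{path}/{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            hits += find_placeholders(value, f"{path}/{i}")
    elif isinstance(node, str) and node.startswith("<your-") and node.endswith(">"):
        hits.append(path)
    return hits


config_file = Path("input.json")
if config_file.exists():
    config = json.loads(config_file.read_text())
    print("unfilled fields:", find_placeholders(config))
    # Paths referenced by the workflow should exist before submission.
    for p in config["inputs"]["init_data_sys"] + [config["fp"]["run_config"]["teacher_model_path"]]:
        print(p, "ok" if Path(p).exists() else "MISSING")
```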

Here is a simple LAMMPS input template for NVT simulations, where the number of steps, temperature, and output frequency are provided as variables.

%%file template.lammps
variable NSTEPS equal V_NSTEPS
variable TEMP equal V_TEMP
variable THERMO_FREQ equal V_DUMPFREQ
variable TAU_T equal 0.100000

# Initialization
units metal
dimension 3
atom_style atomic
boundary p p p

read_data conf.lmp
mass 1 16.0
mass 2 4.0

# Interatomic potentials - DeepMD
pair_style deepmd
pair_coeff * *

timestep 0.001 # ps
velocity all create ${TEMP} 1815191 mom yes rot yes dist gaussian

run_style verlet
fix 1 all nvt temp ${TEMP} ${TEMP} ${TAU_T}
thermo_style custom step temp pe etotal press
thermo ${THERMO_FREQ} # Output thermodynamic properties
dump dpgen_dump
run ${NSTEPS}
Writing template.lammps
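
The V_NSTEPS, V_TEMP, and V_DUMPFREQ tokens in the template are placeholders that are filled in from the revisions field of input.json. The following sketch shows the idea with plain string substitution; it is an illustration only, DP-Gen2's own implementation may differ in detail, and the function name fill_template is hypothetical.

```python
from typing import Dict


def fill_template(template: str, values: Dict[str, object]) -> str:
    """Replace each placeholder key (e.g. V_TEMP) with its value."""
    # Replace longer keys first so that one key being a prefix of another is safe.
    for key in sorted(values, key=len, reverse=True):
        template = template.replace(key, str(values[key]))
    return template


snippet = "variable NSTEPS equal V_NSTEPS\nvariable TEMP equal V_TEMP\n"
print(fill_template(snippet, {"V_NSTEPS": 10000, "V_TEMP": 330}))
```

Because the substitution is textual, placeholder names should not occur anywhere else in the template.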

The following is the DP training input template for the distilled model (DPA-1 without attention layers, i.e. attn_layer set to 0):

%%file train.json
{
    "model": {
        "type_map": [
            "O",
            "H"
        ],
        "descriptor": {
            "type": "se_atten_v2",
            "sel": 120,
            "rcut_smth": 0.5,
            "rcut": 6.0,
            "neuron": [
                25,
                50,
                100
            ],
            "resnet_dt": false,
            "axis_neuron": 16,
            "seed": 1,
            "attn": 128,
            "attn_layer": 0,
            "attn_dotr": true,
            "attn_mask": false,
            "_comment": " that's all"
        },
        "fitting_net": {
            "neuron": [
                240,
                240,
                240
            ],
            "resnet_dt": true,
            "seed": 1,
            "_comment": " that's all"
        },
        "_comment": " that's all"
    },
    "learning_rate": {
        "type": "exp",
        "decay_steps": 5000,
        "start_lr": 0.001,
        "stop_lr": 3.51e-08,
        "_comment": "that's all"
    },
    "loss": {
        "type": "ener",
        "start_pref_e": 0.02,
        "limit_pref_e": 1,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0,
        "limit_pref_v": 0,
        "_comment": " that's all"
    },
    "training": {
        "training_data": {
            "systems": [],
            "batch_size": "auto",
            "_comment": "that's all"
        },
        "validation_data": {
            "systems": [],
            "batch_size": 1,
            "numb_btch": 3,
            "_comment": "that's all"
        },
        "numb_steps": 1000000,
        "seed": 10,
        "disp_file": "lcurve.out",
        "disp_freq": 100,
        "save_freq": 1000,
        "_comment": "that's all"
    },
    "_comment": "that's all"
}
Writing train.json
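
The learning_rate and loss sections of train.json are coupled: the learning rate decays exponentially from start_lr to stop_lr, and the energy and force prefactors move from their start to their limit values as it decays. The sketch below evaluates both with the numbers from the template. It assumes the exponential schedule and prefactor interpolation described in the DeePMD-kit documentation, applied as a staircase every decay_steps steps; the function names are hypothetical.

```python
def decay_rate(start_lr: float, stop_lr: float, decay_steps: int, numb_steps: int) -> float:
    """Per-interval decay factor that takes start_lr to stop_lr over numb_steps."""
    return (stop_lr / start_lr) ** (decay_steps / numb_steps)


def learning_rate(step: int, start_lr=1e-3, stop_lr=3.51e-8,
                  decay_steps=5000, numb_steps=1_000_000) -> float:
    """Staircase exponential decay: the rate drops once every decay_steps steps."""
    rate = decay_rate(start_lr, stop_lr, decay_steps, numb_steps)
    return start_lr * rate ** (step // decay_steps)


def prefactor(step: int, start_pref: float, limit_pref: float, start_lr=1e-3) -> float:
    """Loss prefactor interpolated between its start and limit values by lr(t)/start_lr."""
    frac = learning_rate(step) / start_lr
    return limit_pref * (1.0 - frac) + start_pref * frac


# Columns: step, learning rate, force prefactor, energy prefactor.
for step in (0, 500_000, 1_000_000):
    print(step, learning_rate(step), prefactor(step, 1000, 1), prefactor(step, 0.02, 1))
```

With these settings the force term dominates early in training, and the energy and force terms end up with equal prefactors once the learning rate has decayed.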

Finally, submit the distillation workflow:

!dpgen2 submit input.json
Workflow has been submitted (ID: water-distill-kk4wr, UID: b1de6386-1191-4049-9758-70e59616b286)
Workflow link: https://workflows.deepmodeling.com/workflows/argo/water-distill-kk4wr

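The progress of the workflow can be tracked through the link printed above. The metrics for each iteration of the distillation can be obtained with the dpgen2 command line. Pass the workflow ID printed at submission; the outputs shown below come from a run with the ID water-distill-zzlmx.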

!dpgen2 status input.json water-distill-zzlmx
100%|█████████████████████████████████████████████| 1/1 [00:00<00:00,  3.47it/s]
#   stage  id_stg.    iter.      accu.      cand.      fail.   lvl_f_lo lvl_f_hi
# Stage    0  --------------------
        0        0        0     0.8444     0.1489     0.0067     0.0491   0.5000
        0        1        1     0.8500     0.1500     0.0000     0.0458   0.5000
        0        2        2     0.8500     0.1500     0.0000     0.0449   0.5000
        0        3        3     0.8499     0.1499     0.0002     0.0438   0.5000
        0        4        4     0.8500     0.1500     0.0000     0.0439   0.5000
        0        5        5     0.8499     0.1499     0.0002     0.0427   0.5000
        0        6        6     0.8500     0.1500     0.0000     0.0412   0.5000
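
As we read the table, accu, cand, and fail are the fractions of explored frames classified as accurate, candidate, and failed, and lvl_f_lo is the adaptively chosen lower force-deviation level. With rate_candi_f set to 0.15 in input.json, the candidate fraction settles near 0.15 while lvl_f_lo drifts downward. The sketch below parses a few rows copied from the output above and prints how much lvl_f_lo moves between iterations; it is a reading aid, not DP-Gen2's convergence check, and the names parse_status and COLUMNS are hypothetical.

```python
from typing import Dict, List

STATUS = """\
#   stage  id_stg.    iter.      accu.      cand.      fail.   lvl_f_lo lvl_f_hi
# Stage    0  --------------------
        0        0        0     0.8444     0.1489     0.0067     0.0491   0.5000
        0        1        1     0.8500     0.1500     0.0000     0.0458   0.5000
        0        2        2     0.8500     0.1500     0.0000     0.0449   0.5000
"""

COLUMNS = ["stage", "id_stg", "iter", "accu", "cand", "fail", "lvl_f_lo", "lvl_f_hi"]


def parse_status(text: str) -> List[Dict[str, float]]:
    """Parse the non-comment rows of the status table into dictionaries."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        rows.append(dict(zip(COLUMNS, map(float, line.split()))))
    return rows


rows = parse_status(STATUS)
for prev, cur in zip(rows, rows[1:]):
    change = abs(cur["lvl_f_lo"] - prev["lvl_f_lo"]) / prev["lvl_f_lo"]
    print(f"iter {int(cur['iter'])}: lvl_f_lo changed by {change:.1%}")
```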


To test the accuracy of the distilled model, download the models from a specific iteration, for example iteration 6:

!dpgen2 download input.json water-distill-zzlmx -i 6 -d prep-run-train/output/models
100%|█████████████████████████████████████████████| 9/9 [00:01<00:00,  8.15it/s]
INFO:root:iter-000006/prep-run-train/output/models downloaded

Then use the dp test command to evaluate one of the downloaded models on valid_predict, the validation data labeled by the teacher model:

!dp test -m iter-000006/prep-run-train/output/models/task.0000/frozen_model.pb -s valid_predict
DEEPMD WARNING You can use the environment variable DP_INFER_BATCH_SIZE tocontrol the inference batch size (nframes * natoms). The default value is 1024.
DEEPMD INFO    # ---------------output of dp test--------------- 
DEEPMD INFO    # testing system : valid_predict
DEEPMD INFO    # number of test data : 2000 
DEEPMD INFO    Energy MAE         : 7.485752e-02 eV
DEEPMD INFO    Energy RMSE        : 9.166339e-02 eV
DEEPMD INFO    Energy MAE/Natoms  : 3.898829e-04 eV
DEEPMD INFO    Energy RMSE/Natoms : 4.774135e-04 eV
DEEPMD INFO    Force  MAE         : 2.743577e-02 eV/A
DEEPMD INFO    Force  RMSE        : 3.531224e-02 eV/A
DEEPMD INFO    Virial MAE         : 2.633329e-01 eV
DEEPMD INFO    Virial RMSE        : 3.400125e-01 eV
DEEPMD INFO    Virial MAE/Natoms  : 1.371525e-03 eV
DEEPMD INFO    Virial RMSE/Natoms : 1.770899e-03 eV
DEEPMD INFO    # ----------------------------------------------- 
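
Because valid_predict carries the teacher model's labels, these errors measure how closely the distilled model reproduces the finetuned model, not its error against the original reference data. The short sketch below restates the numbers in more familiar units and recovers the number of atoms per frame from the reported totals; the values are copied from the dp test output above, and the helper name natoms_from is hypothetical.

```python
def natoms_from(total_err: float, per_atom_err: float) -> int:
    """Infer the number of atoms per frame from a total and a per-atom error."""
    return round(total_err / per_atom_err)


# Values copied from the dp test output above.
energy_mae, energy_mae_per_atom = 7.485752e-02, 3.898829e-04   # eV
energy_rmse_per_atom = 4.774135e-04                             # eV
force_rmse = 3.531224e-02                                       # eV/A

print("atoms per frame:", natoms_from(energy_mae, energy_mae_per_atom))
print(f"energy RMSE: {energy_rmse_per_atom * 1e3:.3f} meV/atom")
print(f"force RMSE:  {force_rmse * 1e3:.2f} meV/A")
```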
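Comments

cxj (02-27 05:04), on "Distillation can sig...":
A small suggestion: add an intuitive comparison of the accuracy before and after distillation to this or other DPA-2 notebooks.

liujianchuan@pku.edu.cn (05-22 19:37), on "import random n_sel...":
dpdata needs to be imported again in this cell.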