Quick Start with DeePMD-kit | Training a Deep Potential Molecular Dynamics Model for Methane
Tutorial
DeePMD-kit
MileAway
AnguseZhang
Mancn
爱学习的王一博
Published 2023-06-09
Recommended image: DeePMD-kit:2.2.1-cuda11.6-notebook
Recommended machine type: c8_m16_cpu
DeePMD-kit Quick-Start(v1)

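©️ Copyright 2023 @ Authors
Date: 2023-05-09
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: Click the "Start Connection" button at the top, select the deepmd-kit:2.2.1-cuda11.6-notebook image and the c8_m16_cpu node configuration, and the notebook will be ready to run after a short wait.

This is a quick-start guide to DeePMD-kit for Deep Potential molecular dynamics. It walks you through the typical DeePMD-kit workflow so that you can apply it to your own projects.

A video talk accompanying this article is linked below. Because of the video site's settings, playback embedded in this page may be low resolution; visit the original site for a sharper picture.
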
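Deep Potential is where machine learning meets physical principles. It represents a new computing paradigm, shown in the figure below.

Fig2

Figure | A new computing paradigm that combines molecular modeling, machine learning, and high-performance computing (HPC).

For a deeper introduction to Deep Potential, see 👉 From DFT to MD | A Detailed Hands-On Guide to "Deep Potential" Materials Computation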

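Goals

Learn the workflow for building a Deep Potential molecular dynamics model with DeePMD-kit, and follow a complete example to see how it applies to a molecular dynamics task.

After completing this tutorial, you will be able to:

  • Understand the data format and run scripts required for DeePMD-kit training
  • Train, freeze, compress, and test a DeePMD-kit model
  • Call DeePMD-kit from the molecular dynamics package LAMMPS to run calculations

For more DeePMD learning material, see the following two courses:

Reading this tutorial takes about 20 minutes at most. Let's get started!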

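Background

In this tutorial, we use a gas-phase methane molecule as an example to walk through the training and application of a Deep Potential model in detail.

DeePMD-kit is a software package that fits neural-network potential energy models to first-principles data for use in molecular dynamics simulations. Without manual intervention, it can turn user-provided data into a Deep Potential model end to end within a few hours, and the resulting model interfaces seamlessly with common molecular dynamics packages such as LAMMPS, OpenMM, and GROMACS.

By combining high-performance computing with machine learning, DeePMD-kit has pushed the limits of molecular dynamics by several orders of magnitude, reaching systems of over one hundred million atoms while retaining ab initio accuracy, with simulation time scales at least 1000 times longer than those of traditional methods. This work received the 2020 ACM Gordon Bell Prize, the highest award in high-performance computing, and DeePMD-kit is now used by more than a thousand research groups worldwide in physics, chemistry, materials science, biology, and related fields.

Fig1

For more detailed usage, refer to the DeePMD-kit documentation as a complete reference.

In this example, the Deep Potential (DP) model is generated with the DeePMD-kit package (v2.2.1).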

Practice

1 Data preparation

We have prepared the initial data needed to run DeePMD-kit and placed it in the folder DeePMD-kit_Tutorial. You can click the dataset panel on the left to view the files:
[1]
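# For security reasons, we do not have write permission to the dataset folder, so we copy it to the `/personal/` directory:
! cp -nr /bohr/ /personal/

# Define some paths and switch to the working directory for later use:
import os
bohr_dataset_url = "/bohr/deepmd-kit-8n4p/v1/"  # the url can be copied from the dataset panel on the left
work_path = os.path.join("/personal", bohr_dataset_url[1:])  # slice off the leading "/" of the path above
os.chdir(work_path)
print(f"Current path: {os.getcwd()}")
Current path: /personal/bohr/deepmd-kit-8n4p/v1

Let's look at the downloaded DeePMD-kit_Tutorial folder: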
[2]
! tree DeePMD-kit_Tutorial -L 1
DeePMD-kit_Tutorial
├── 00.data
├── 01.train
└── 02.lmp

3 directories, 0 files
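
The DeePMD-kit_Tutorial folder contains three subfolders: 00.data, 01.train, and 02.lmp.

  • 00.data stores the training and test data,
  • 01.train contains example scripts for training a model with DeePMD-kit,
  • 02.lmp contains example LAMMPS scripts for molecular dynamics simulation.

Let's first look at the DeePMD-kit_Tutorial/00.data folder.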
[3]
! tree DeePMD-kit_Tutorial/00.data -L 1
DeePMD-kit_Tutorial/00.data
└── abacus_md

1 directory, 0 files
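
DeePMD-kit training data comes from first-principles calculations and contains atom types, simulation cells, atomic coordinates, atomic forces, system energies, and virials.

image-20230116161737203

The 00.data folder contains only the abacus_md folder, which holds data obtained from ab initio molecular dynamics (AIMD) run with ABACUS. For this tutorial, we have already completed the AIMD calculation of the methane molecule for you.

Detailed instructions for ABACUS can be found in its documentation. You can also find help in Chapter 2 of A Detailed Hands-On Guide to "Deep Potential" Materials Computation.

DeePMD-kit uses a compressed data format. All training data must first be converted to this format before it can be used by DeePMD-kit. The format is explained in detail in the DeePMD-kit manual, which can be found on the DeePMD-kit GitHub page.

We provide a convenient tool, dpdata, which converts data produced by VASP, CP2K, Gaussian, Quantum-Espresso, ABACUS, and LAMMPS into DeePMD-kit's compressed format.

A snapshot of a molecular system together with its computed data is called a frame. A data system consists of many frames that share the same number of atoms and the same atom types.

For example, a molecular dynamics trajectory can be converted into a data system in which each time step corresponds to one frame.

Next, we use dpdata to randomly split the data in abacus_md into training and validation sets.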
[4]
import dpdata
import numpy as np

# Load data in ABACUS/MD format
data = dpdata.LabeledSystem('DeePMD-kit_Tutorial/00.data/abacus_md', fmt = 'abacus/md')
print('# the data contains %d frames' % len(data))

# Randomly choose 40 indices as validation data
index_validation = np.random.choice(201,size=40,replace=False)

# Use the remaining indices as training data
index_training = list(set(range(201))-set(index_validation))
data_training = data.sub_system(index_training)
data_validation = data.sub_system(index_validation)

# Write all training data to the folder "training_data"
data_training.to_deepmd_npy('DeePMD-kit_Tutorial/00.data/training_data')

# Write all validation data to the folder "validation_data"
data_validation.to_deepmd_npy('DeePMD-kit_Tutorial/00.data/validation_data')

print('# the training data contains %d frames' % len(data_training))
print('# the validation data contains %d frames' % len(data_validation))
# the data contains 201 frames
# the training data contains 161 frames
# the validation data contains 40 frames

As shown, 161 frames are selected as training data and the other 40 frames as validation data.
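
The split above draws from an unseeded random generator and hard-codes the frame count, so each run selects different frames. A minimal sketch of a reproducible variant follows; `split_indices` and the seed value are hypothetical names introduced here for illustration and are not part of dpdata or DeePMD-kit.

```python
import numpy as np

def split_indices(n_frames, n_validation, seed=0):
    """Return (training, validation) frame indices with no overlap."""
    rng = np.random.default_rng(seed)  # fixed seed: same split on every run
    validation = np.sort(rng.choice(n_frames, size=n_validation, replace=False))
    # Every index that was not drawn for validation goes to training.
    training = np.setdiff1d(np.arange(n_frames), validation)
    return training, validation

index_training, index_validation = split_indices(n_frames=201, n_validation=40)
print(len(index_training), len(index_validation))  # 161 40
```

In the notebook, one would presumably pass `len(data)` as `n_frames` and hand the two index arrays to `data.sub_system(...)` exactly as in the cell above.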

Let's look at the 00.data folder again. It now contains new folders: the training set and the validation set needed for DeePMD-kit Deep Potential training.
[5]
! tree DeePMD-kit_Tutorial/00.data/ -L 1
DeePMD-kit_Tutorial/00.data/
├── abacus_md
├── training_data
└── validation_data

3 directories, 0 files
[6]
! tree DeePMD-kit_Tutorial/00.data/training_data -L 1
DeePMD-kit_Tutorial/00.data/training_data
├── set.000
├── type.raw
└── type_map.raw

1 directory, 2 files
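
These files serve the following purposes:

  1. set.000: a directory containing the data in compressed format (stored as NumPy arrays).
  2. type.raw: a file containing the atom types (as integers).
  3. type_map.raw: a file containing the names of the atom types.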
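
To make the layout of a set.000 directory more concrete, here is a small sketch that writes synthetic arrays and reads their shapes back. The file names (coord.npy, force.npy, energy.npy, box.npy) and the flattened per-frame shapes are assumptions based on the DeePMD-kit data format as I understand it; the values are zeros, not tutorial data.

```python
import os
import tempfile
import numpy as np

n_frames, n_atoms = 3, 5  # tiny synthetic example, not the tutorial data

with tempfile.TemporaryDirectory() as tmp:
    set_dir = os.path.join(tmp, "set.000")
    os.makedirs(set_dir)
    # One row per frame; per-atom vectors are flattened along the second axis.
    np.save(os.path.join(set_dir, "coord.npy"), np.zeros((n_frames, n_atoms * 3)))
    np.save(os.path.join(set_dir, "force.npy"), np.zeros((n_frames, n_atoms * 3)))
    # The 3x3 cell matrix of each frame is flattened to 9 numbers.
    np.save(os.path.join(set_dir, "box.npy"), np.zeros((n_frames, 9)))
    # One total energy per frame.
    np.save(os.path.join(set_dir, "energy.npy"), np.zeros(n_frames))

    # Reload every file and record its shape.
    shapes = {name: np.load(os.path.join(set_dir, name)).shape
              for name in sorted(os.listdir(set_dir))}

print(shapes)
```

Running `np.load` on the files in the real training_data/set.000 directory is a quick way to check that the number of rows equals the number of frames.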

Let's take a look at these files:
[7]
! cat DeePMD-kit_Tutorial/00.data/training_data/type.raw
0
0
0
0
1

This tells us that the example has 5 atoms, 4 of type "0" and 1 of type "1". Sometimes the integer types need to be mapped to element names; this mapping is given by the file type_map.raw.
[8]
! cat DeePMD-kit_Tutorial/00.data/training_data/type_map.raw
H
C

This tells us that type "0" is named "H" and type "1" is named "C".
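
The two files combine by simple indexing: each integer in type.raw is an index into the list of names in type_map.raw. A minimal sketch, with the file contents shown above pasted in as strings so that it runs without the dataset:

```python
from collections import Counter

type_raw = "0\n0\n0\n0\n1\n"  # contents of type.raw shown above
type_map_raw = "H\nC\n"       # contents of type_map.raw shown above

atom_types = [int(token) for token in type_raw.split()]
type_map = type_map_raw.split()

# Each integer type is an index into the list of element names.
elements = [type_map[t] for t in atom_types]
print(elements)           # ['H', 'H', 'H', 'H', 'C']
print(Counter(elements))  # Counter({'H': 4, 'C': 1})
```

Four hydrogens and one carbon is the composition of methane, CH4.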

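More detailed documentation on converting data with dpdata can be found in the dpdata documentation.

2 Preparing the input script

Once the training data is ready, training can begin. DeePMD-kit requires a JSON file that specifies the training parameters. This file is called the DeePMD-kit input script. Let's go to the training directory and look at it: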
[9]
! cd DeePMD-kit_Tutorial/01.train/ && cat input.json
{
    "_comment": " model parameters",
    "model": {
	"type_map":	["H", "C"],
	"descriptor" :{
	    "type":		"se_e2_a",
	    "sel":		"auto",
	    "rcut_smth":	0.50,
	    "rcut":		6.00,
	    "neuron":		[25, 50, 100],
	    "resnet_dt":	false,
	    "axis_neuron":	16,
	    "seed":		1,
	    "_comment":		" that's all"
	},
	"fitting_net" : {
	    "neuron":		[240, 240, 240],
	    "resnet_dt":	true,
	    "seed":		1,
	    "_comment":		" that's all"
	},
	"_comment":	" that's all"
    },

    "learning_rate" :{
	"type":		"exp",
	"decay_steps":	50,
	"start_lr":	0.001,	
	"stop_lr":	3.51e-8,
	"_comment":	"that's all"
    },

    "loss" :{
	"type":		"ener",
	"start_pref_e":	0.02,
	"limit_pref_e":	1,
	"start_pref_f":	1000,
	"limit_pref_f":	1,
	"start_pref_v":	0,
	"limit_pref_v":	0,
	"_comment":	" that's all"
    },

    "training" : {
	"training_data": {
	    "systems":     ["../00.data/training_data"],
	    "batch_size":  "auto",
	    "_comment":	   "that's all"
	},
	"validation_data":{
	    "systems":	   ["../00.data/validation_data"],
	    "batch_size":  "auto",
	    "numb_btch":   1,
	    "_comment":	   "that's all"
	},
	"numb_steps":	10000,
	"seed":		10,
	"disp_file":	"lcurve.out",
	"disp_freq":	200,
	"save_freq":	1000,
	"_comment":	"that's all"
    },    

    "_comment":		"that's all"
}

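
The model section specifies the parameters of the embedding and fitting networks.

"model":{
    "type_map":    ["H", "C"],
    "descriptor":{
        "type":            "se_e2_a",
        "rcut":            6.00,
        "rcut_smth":       0.50,
        "sel":             "auto",
        "neuron":          [25, 50, 100],
        "resnet_dt":       false,
        "axis_neuron":     16,
        "seed":            1,
        "_comment":        "that's all"
        },
    "fitting_net":{
        "neuron":          [240, 240, 240],
        "resnet_dt":       true,
        "seed":            1,
        "_comment":        "that's all"
    },
    "_comment":    "that's all"
},

Some of the parameters are explained below:

Parameter                   Description
type_map                    Name of each atom type
descriptor > type           Descriptor type
descriptor > rcut           Cutoff radius
descriptor > rcut_smth      Radius at which smoothing starts
descriptor > sel            Maximum number of atoms of type i within the cutoff radius
descriptor > neuron         Size of the embedding neural network
descriptor > axis_neuron    Size of the submatrix of G (the embedding matrix)
fitting_net > neuron        Size of the fitting neural network

The se_e2_a descriptor is used to train the DP model. The neuron parameters set the sizes of the embedding and fitting networks to [25, 50, 100] and [240, 240, 240], respectively. The components of the local environment decay smoothly to zero between 0.5 Å and 6 Å.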
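
To illustrate what "decay smoothly to zero" between rcut_smth and rcut can look like, here is a sketch of a switching weight s(r). The fifth-order polynomial form is the one I understand the DeePMD-kit documentation to describe for se_e2_a; treat it as an assumption rather than a copy of the library's implementation. `smooth_switch` is a name introduced here.

```python
import numpy as np

def smooth_switch(r, rcut_smth=0.5, rcut=6.0):
    """Weight s(r): 1/r below rcut_smth, switched smoothly to 0 at rcut (r > 0)."""
    r = np.asarray(r, dtype=float)
    # x runs from 0 at rcut_smth to 1 at rcut and is clamped outside that range.
    x = np.clip((r - rcut_smth) / (rcut - rcut_smth), 0.0, 1.0)
    # Polynomial equal to 1 at x=0 and 0 at x=1, with zero first and
    # second derivatives at both ends, so s(r) has no kinks.
    poly = x**3 * (-6.0 * x**2 + 15.0 * x - 10.0) + 1.0
    return poly / r

for r in (0.25, 0.5, 3.25, 6.0, 7.0):
    print(f"r = {r:4.2f} Å  s(r) = {float(smooth_switch(r)):.4f}")
```

With the tutorial's values, an atom at 3.25 Å (halfway through the switching range) gets half of its unswitched 1/r weight, and anything at or beyond 6 Å contributes nothing.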

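The following parameters specify the learning rate and the loss function.

    "learning_rate" :{
        "type":                "exp",
        "decay_steps":         50,
        "start_lr":            0.001,    
        "stop_lr":             3.51e-8,
        "_comment":            "that's all"
    },
    "loss" :{
        "type":                "ener",
        "start_pref_e":        0.02,
        "limit_pref_e":        1,
        "start_pref_f":        1000,
        "limit_pref_f":        1,
        "start_pref_v":        0,
        "limit_pref_v":        0,
        "_comment":            "that's all"
    },

In the loss function, pref_e increases gradually from 0.02 to 1 and pref_f decreases gradually from 1000 to 1, so the force term dominates at the start of training while the energy term (and the virial term, when it is enabled) becomes important toward the end. This strategy is very effective and reduces the total training time. pref_v is set to 0, which means virial data is not included in training. The start learning rate, stop learning rate, and decay steps are set to 0.001, 3.51e-8, and 50, respectively. The model is trained for 10000 steps.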
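
The gradual change of the prefactors can be sketched as a linear interpolation driven by the current learning rate. This follows the schedule as I understand it from the DeePMD-kit documentation, pref(t) = limit_pref * (1 - lr(t)/start_lr) + start_pref * lr(t)/start_lr; it is an illustration under that assumption, and `prefactor` is a name introduced here.

```python
def prefactor(start_pref, limit_pref, lr, start_lr=1.0e-3):
    """Interpolate a loss prefactor between its start and limit values."""
    ratio = lr / start_lr  # 1 at the first step, close to 0 at the end
    return limit_pref * (1.0 - ratio) + start_pref * ratio

# Learning rates at the start, somewhere in the middle, and at the end of training.
for lr in (1.0e-3, 1.0e-5, 3.51e-8):
    pref_e = prefactor(0.02, 1.0, lr)
    pref_f = prefactor(1000.0, 1.0, lr)
    print(f"lr = {lr:.2e}  pref_e = {pref_e:.4f}  pref_f = {pref_f:.4f}")
```

Under this schedule the force prefactor is 50000 times the energy prefactor at the first step, and the two end up nearly equal at the final learning rate.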

The training parameters are as follows:

    "training" : {
        "training_data": {
            "systems":            ["../00.data/training_data"],
            "batch_size":         "auto",
            "_comment":           "that's all"
        },
        "validation_data":{
            "systems":            ["../00.data/validation_data"],
            "batch_size":         "auto",
            "numb_btch":          1,
            "_comment":           "that's all"
        },
        "numb_steps":             10000,
        "seed":                   10,
        "disp_file":              "lcurve.out",
        "disp_freq":              200,
        "save_freq":              1000,
        "_comment":               "that's all"
    },

3 Training the model

With the training script ready, we can start training simply by running DeePMD-kit.
[10]
# ########## Time Warning: 8 mins 48 secs ##########
! cd DeePMD-kit_Tutorial/01.train/ && dp train input.json
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
DEEPMD INFO    Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
2023-09-25 18:08:39.837014: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-09-25 18:08:39.837045: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0,1
OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #216: KMP_AFFINITY: cpuid leaf 11 not supported.
OMP: Info #216: KMP_AFFINITY: decoding legacy APIC ids.
OMP: Info #157: KMP_AFFINITY: 2 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #192: KMP_AFFINITY: 1 socket x 1 core/socket x 2 threads/core (1 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 0 thread 1 
OMP: Info #254: KMP_AFFINITY: pid 99 tid 109 thread 1 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 99 tid 111 thread 2 bound to OS proc set 0
OMP: Info #254: KMP_AFFINITY: pid 99 tid 108 thread 3 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 99 tid 112 thread 4 bound to OS proc set 0
DEEPMD INFO    training data with min nbor dist: 1.045920568611028
DEEPMD INFO    training data with max nbor size: [4 1]
DEEPMD INFO     _____               _____   __  __  _____           _     _  _   
DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |  
DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_ 
DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_ 
DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
DEEPMD INFO    Please read and cite:
DEEPMD INFO    Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO    installed to:         /home/conda/feedstock_root/build_artifacts/deepmd-kit_1678943793317/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO    source :              v2.2.1
DEEPMD INFO    source brach:         HEAD
DEEPMD INFO    source commit:        3ac8c4c7
DEEPMD INFO    source commit at:     2023-03-16 12:33:24 +0800
DEEPMD INFO    build float prec:     double
DEEPMD INFO    build variant:        cuda
DEEPMD INFO    build with tf inc:    /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/include;/opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/../../../../include
DEEPMD INFO    build with tf lib:    
DEEPMD INFO    ---Summary of the training---------------------------------------
DEEPMD INFO    running on:           bohrium-14076-1043333
DEEPMD INFO    computing device:     cpu:0
DEEPMD INFO    CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO    Count of visible GPU: 0
DEEPMD INFO    num_intra_threads:    0
DEEPMD INFO    num_inter_threads:    0
DEEPMD INFO    -----------------------------------------------------------------
DEEPMD INFO    ---Summary of DataSystem: training     -----------------------------------------------
DEEPMD INFO    found 1 system(s):
DEEPMD INFO                                        system  natoms  bch_sz   n_bch   prob  pbc
DEEPMD INFO                      ../00.data/training_data       5       7      23  1.000    T
DEEPMD INFO    --------------------------------------------------------------------------------------
DEEPMD INFO    ---Summary of DataSystem: validation   -----------------------------------------------
DEEPMD INFO    found 1 system(s):
DEEPMD INFO                                        system  natoms  bch_sz   n_bch   prob  pbc
DEEPMD INFO                    ../00.data/validation_data       5       7       5  1.000    T
DEEPMD INFO    --------------------------------------------------------------------------------------
DEEPMD INFO    training without frame parameter
DEEPMD INFO    data stating... (this step may take long time)
OMP: Info #254: KMP_AFFINITY: pid 99 tid 99 thread 0 bound to OS proc set 0
DEEPMD INFO    built lr
DEEPMD INFO    built network
DEEPMD INFO    built training
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
DEEPMD INFO    initialize model from scratch
DEEPMD INFO    start training at lr 1.00e-03 (== 1.00e-03), decay_step 50, decay_rate 0.950006, final lr will be 3.51e-08
DEEPMD INFO    batch     200 training time 11.42 s, testing time 0.24 s
DEEPMD INFO    batch     400 training time 9.99 s, testing time 0.04 s
DEEPMD INFO    batch     600 training time 9.90 s, testing time 0.04 s
DEEPMD INFO    batch     800 training time 9.90 s, testing time 0.04 s
DEEPMD INFO    batch    1000 training time 9.93 s, testing time 0.04 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    1200 training time 9.91 s, testing time 0.03 s
DEEPMD INFO    batch    1400 training time 9.89 s, testing time 0.04 s
DEEPMD INFO    batch    1600 training time 9.96 s, testing time 0.04 s
DEEPMD INFO    batch    1800 training time 9.88 s, testing time 0.04 s
DEEPMD INFO    batch    2000 training time 9.91 s, testing time 0.03 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    2200 training time 9.89 s, testing time 0.04 s
DEEPMD INFO    batch    2400 training time 9.87 s, testing time 0.04 s
DEEPMD INFO    batch    2600 training time 9.91 s, testing time 0.06 s
DEEPMD INFO    batch    2800 training time 9.93 s, testing time 0.04 s
DEEPMD INFO    batch    3000 training time 9.89 s, testing time 0.03 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    3200 training time 9.89 s, testing time 0.03 s
DEEPMD INFO    batch    3400 training time 9.87 s, testing time 0.03 s
DEEPMD INFO    batch    3600 training time 9.89 s, testing time 0.03 s
DEEPMD INFO    batch    3800 training time 9.89 s, testing time 0.03 s
DEEPMD INFO    batch    4000 training time 9.98 s, testing time 0.04 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    4200 training time 9.88 s, testing time 0.03 s
DEEPMD INFO    batch    4400 training time 9.87 s, testing time 0.03 s
DEEPMD INFO    batch    4600 training time 9.85 s, testing time 0.04 s
DEEPMD INFO    batch    4800 training time 9.88 s, testing time 0.03 s
DEEPMD INFO    batch    5000 training time 9.90 s, testing time 0.04 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    5200 training time 9.91 s, testing time 0.04 s
DEEPMD INFO    batch    5400 training time 9.86 s, testing time 0.03 s
DEEPMD INFO    batch    5600 training time 9.88 s, testing time 0.04 s
DEEPMD INFO    batch    5800 training time 9.87 s, testing time 0.03 s
DEEPMD INFO    batch    6000 training time 9.90 s, testing time 0.03 s
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/training/saver.py:1066: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/training/saver.py:1066: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    6200 training time 9.87 s, testing time 0.03 s
DEEPMD INFO    batch    6400 training time 9.97 s, testing time 0.03 s
DEEPMD INFO    batch    6600 training time 9.85 s, testing time 0.03 s
DEEPMD INFO    batch    6800 training time 9.90 s, testing time 0.04 s
DEEPMD INFO    batch    7000 training time 9.86 s, testing time 0.04 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    7200 training time 9.90 s, testing time 0.04 s
DEEPMD INFO    batch    7400 training time 9.92 s, testing time 0.03 s
DEEPMD INFO    batch    7600 training time 9.91 s, testing time 0.04 s
DEEPMD INFO    batch    7800 training time 9.91 s, testing time 0.04 s
DEEPMD INFO    batch    8000 training time 9.87 s, testing time 0.03 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    8200 training time 9.86 s, testing time 0.04 s
DEEPMD INFO    batch    8400 training time 9.86 s, testing time 0.04 s
DEEPMD INFO    batch    8600 training time 9.93 s, testing time 0.04 s
DEEPMD INFO    batch    8800 training time 9.86 s, testing time 0.04 s
DEEPMD INFO    batch    9000 training time 9.92 s, testing time 0.04 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    9200 training time 9.85 s, testing time 0.04 s
DEEPMD INFO    batch    9400 training time 9.89 s, testing time 0.03 s
DEEPMD INFO    batch    9600 training time 9.84 s, testing time 0.03 s
DEEPMD INFO    batch    9800 training time 9.98 s, testing time 0.04 s
DEEPMD INFO    batch   10000 training time 9.86 s, testing time 0.03 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    average training time: 0.0495 s/batch (exclude first 200 batches)
DEEPMD INFO    finished training
DEEPMD INFO    wall time: 509.976 s

Information about the data systems is printed to the screen, for example:

DEEPMD INFO    -----------------------------------------------------------------
DEEPMD INFO    ---Summary of DataSystem: training     ----------------------------------
DEEPMD INFO    found 1 system(s):
DEEPMD INFO                                 system  natoms  bch_sz   n_bch   prob  pbc
DEEPMD INFO               ../00.data/training_data       5       7      23  1.000    T
DEEPMD INFO    -------------------------------------------------------------------------
DEEPMD INFO    ---Summary of DataSystem: validation   ----------------------------------
DEEPMD INFO    found 1 system(s):
DEEPMD INFO                                 system  natoms  bch_sz   n_bch   prob  pbc
DEEPMD INFO             ../00.data/validation_data       5       7       5  1.000    T
DEEPMD INFO    -------------------------------------------------------------------------

along with the start and final learning rates of this training run:

DEEPMD INFO    start training at lr 1.00e-03 (== 1.00e-03), decay_step 50, decay_rate 0.950006, final lr will be 3.51e-08

If everything works, you will see information printed every 200 batches, for example:

DEEPMD INFO    batch     200 training time 6.04 s, testing time 0.02 s
DEEPMD INFO    batch     400 training time 4.80 s, testing time 0.02 s
DEEPMD INFO    batch     600 training time 4.80 s, testing time 0.02 s
DEEPMD INFO    batch     800 training time 4.78 s, testing time 0.02 s
DEEPMD INFO    batch    1000 training time 4.77 s, testing time 0.02 s
DEEPMD INFO    saved checkpoint model.ckpt
DEEPMD INFO    batch    1200 training time 4.47 s, testing time 0.02 s
DEEPMD INFO    batch    1400 training time 4.49 s, testing time 0.02 s
DEEPMD INFO    batch    1600 training time 4.45 s, testing time 0.02 s
DEEPMD INFO    batch    1800 training time 4.44 s, testing time 0.02 s
DEEPMD INFO    batch    2000 training time 4.46 s, testing time 0.02 s
DEEPMD INFO    saved checkpoint model.ckpt

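These lines show the training and testing time counts. At the end of every 1000 batches, the model is saved to the TensorFlow checkpoint file model.ckpt.

Meanwhile, the training and validation errors are written to the file lcurve.out. The file contains 8 columns, which are, from left to right:

  1. Training step
  2. Validation loss
  3. Training loss
  4. Root-mean-square (RMS) validation error of the energy
  5. RMS training error of the energy
  6. RMS validation error of the force
  7. RMS training error of the force
  8. Learning rate

The learning rate is an important concept in machine learning. In a DP model, the learning rate decays exponentially from a large value to a small one, which keeps training efficient early on while preserving the accuracy of the final model. The learning rate settings therefore include both a start learning rate (start_lr) and a stop learning rate (stop_lr). In the example above, the start learning rate, stop learning rate, and decay steps are set to 0.001, 3.51e-8, and 50, respectively, so the learning rate starts at 0.001 and is reduced slightly every 50 steps until it reaches 3.51e-8 (or training ends).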
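
The decay rate reported in the training log can be reproduced from the four settings. The sketch below assumes a staircase schedule, lr(t) = start_lr * decay_rate^(t // decay_steps), with decay_rate chosen so that the learning rate reaches stop_lr at the last step; `learning_rate` is a helper name introduced here.

```python
import math

start_lr, stop_lr = 1.0e-3, 3.51e-8
decay_steps, numb_steps = 50, 10000

# Pick decay_rate so that numb_steps / decay_steps decays take start_lr to stop_lr.
decay_rate = math.exp(math.log(stop_lr / start_lr) * decay_steps / numb_steps)

def learning_rate(step):
    """Staircase exponential decay: one multiplication every decay_steps steps."""
    return start_lr * decay_rate ** (step // decay_steps)

print(f"decay_rate = {decay_rate:.6f}")
for step in (0, 9800, 10000):
    print(f"step {step:5d}: lr = {learning_rate(step):.1e}")
```

The computed decay_rate agrees with the value 0.950006 printed in the log line "start training at lr 1.00e-03 ...", and the learning rates at steps 9800 and 10000 can be compared with the last column of lcurve.out.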

Let's look at the first two and last two lines of the lcurve.out file.
[11]
! cd DeePMD-kit_Tutorial/01.train/ && head -n 2 lcurve.out && tail -n 2 lcurve.out
#  step      rmse_val    rmse_trn    rmse_e_val  rmse_e_trn    rmse_f_val  rmse_f_trn         lr
      0      2.08e+01    1.83e+01      1.33e-01    1.33e-01      6.58e-01    5.79e-01    1.0e-03
   9800      3.88e-02    4.06e-02      7.06e-04    6.16e-04      3.80e-02    3.97e-02    4.3e-08
  10000      4.84e-02    3.64e-02      7.77e-04    4.19e-04      4.75e-02    3.58e-02    3.5e-08

We can visualize the loss to monitor the training process.
[12]
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Read the column names from the header line (drop the leading "#")
with open("./DeePMD-kit_Tutorial/01.train/lcurve.out") as f:
    headers = f.readline().split()[1:]
# np.loadtxt skips lines starting with "#" by default
lcurve = pd.DataFrame(np.loadtxt("./DeePMD-kit_Tutorial/01.train/lcurve.out"), columns=headers)
legends = ["rmse_e_val", "rmse_e_trn", "rmse_f_val" , "rmse_f_trn" ]

# Plot the energy and force RMSE curves on log-log axes
for legend in legends:
    plt.loglog(lcurve["step"], lcurve[legend], label = legend )
plt.legend()
plt.xlabel("Training steps")
plt.ylabel("Loss")
plt.show()
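
4 Freezing the model

At the end of training, the model parameters saved in the TensorFlow checkpoint files should be frozen into a single model file, which usually has the extension .pb. Simply run the following command: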
[13]
# Go to the DeePMD-kit_Tutorial/01.train/ training folder and freeze the model
! cd DeePMD-kit_Tutorial/01.train/ && dp freeze -o graph.pb
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
2023-09-25 18:48:11.340773: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-09-25 18:48:11.340810: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
DEEPMD INFO    The following nodes will be frozen: ['model_type', 'descrpt_attr/rcut', 'descrpt_attr/ntypes', 'model_attr/tmap', 'model_attr/model_type', 'model_attr/model_version', 'train_attr/min_nbor_dist', 'train_attr/training_script', 'o_energy', 'o_force', 'o_virial', 'o_atom_energy', 'o_atom_virial', 'fitting_attr/dfparam', 'fitting_attr/daparam']
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:354: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:354: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
DEEPMD INFO    1211 ops in the final graph.
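
This writes a model file named graph.pb to the current directory.

At this point, we have a Deep Potential model that DeePMD-kit trained on high-accuracy ab initio molecular dynamics data: DeePMD-kit_Tutorial/01.train/graph.pb

5 Compressing the model

Compressing a DP model typically speeds up DP-based calculations by an order of magnitude and uses less memory. graph.pb can be compressed as follows: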
[ ]
# Go to the DeePMD-kit_Tutorial/01.train/ directory and compress the model
! cd DeePMD-kit_Tutorial/01.train/ && dp compress -i graph.pb -o compress.pb

6 Testing the model

Let's check the quality of the trained model:
[14]
! cd DeePMD-kit_Tutorial/01.train/ && dp test -m graph.pb -s ../00.data/validation_data
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
2023-09-25 18:48:20.769104: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-09-25 18:48:20.769133: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
DEEPMD WARNING You can use the environment variable DP_INFER_BATCH_SIZE tocontrol the inference batch size (nframes * natoms). The default value is 1024.
DEEPMD INFO    # ---------------output of dp test--------------- 
DEEPMD INFO    # testing system : ../00.data/validation_data
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0,1
OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #216: KMP_AFFINITY: cpuid leaf 11 not supported.
OMP: Info #216: KMP_AFFINITY: decoding legacy APIC ids.
OMP: Info #157: KMP_AFFINITY: 2 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #192: KMP_AFFINITY: 1 socket x 1 core/socket x 2 threads/core (1 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 0 thread 1 
OMP: Info #254: KMP_AFFINITY: pid 145 tid 150 thread 1 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 145 tid 152 thread 2 bound to OS proc set 0
OMP: Info #254: KMP_AFFINITY: pid 145 tid 149 thread 3 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 145 tid 153 thread 4 bound to OS proc set 0
DEEPMD INFO    # number of test data : 40 
DEEPMD INFO    Energy MAE         : 2.837528e-03 eV
DEEPMD INFO    Energy RMSE        : 3.636054e-03 eV
DEEPMD INFO    Energy MAE/Natoms  : 5.675056e-04 eV
DEEPMD INFO    Energy RMSE/Natoms : 7.272108e-04 eV
DEEPMD INFO    Force  MAE         : 2.918619e-02 eV/A
DEEPMD INFO    Force  RMSE        : 3.885977e-02 eV/A
DEEPMD INFO    Virial MAE         : 4.011785e-02 eV
DEEPMD INFO    Virial RMSE        : 5.432392e-02 eV
DEEPMD INFO    Virial MAE/Natoms  : 8.023570e-03 eV
DEEPMD INFO    Virial RMSE/Natoms : 1.086478e-02 eV
DEEPMD INFO    # ----------------------------------------------- 
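
The per-atom lines in this output are the totals divided by the number of atoms (5 here); for example, 3.636054e-03 eV / 5 = 7.272108e-04 eV for the energy RMSE. As a rough illustration of how MAE and RMSE are computed, here is a minimal sketch with made-up energies. The numbers are not tutorial data, and `mae` and `rmse` are helper names introduced here rather than DeePMD-kit functions.

```python
import numpy as np

def mae(pred, ref):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(ref))))

def rmse(pred, ref):
    """Root-mean-square error; penalizes large deviations more than MAE."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(ref)) ** 2)))

# Made-up total energies (eV) for three frames of a 5-atom system.
e_ref = np.array([-10.00, -10.20, -9.90])
e_pred = np.array([-10.01, -10.18, -9.93])
n_atoms = 5

print(f"Energy MAE         : {mae(e_pred, e_ref):.6e} eV")
print(f"Energy RMSE        : {rmse(e_pred, e_ref):.6e} eV")
print(f"Energy RMSE/Natoms : {rmse(e_pred, e_ref) / n_atoms:.6e} eV")
```

RMSE is never smaller than MAE, which is consistent with every MAE/RMSE pair in the dp test output above.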

Let's compute the correlation between the predicted data and the original data and visualize it.
[15]
import dpdata

training_systems = dpdata.LabeledSystem("./DeePMD-kit_Tutorial/00.data/training_data", fmt = "deepmd/npy") # load the training data points
predict = training_systems.predict("./DeePMD-kit_Tutorial/01.train/graph.pb") # get the predicted data points
2023-09-25 18:48:26.441179: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-25 18:48:28.960609: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-09-25 18:48:28.961650: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-09-25 18:48:28.961667: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
WARNING:tensorflow:From /opt/mamba/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:tensorflow:From /opt/mamba/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2023-09-25 18:48:30.816197: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-25 18:48:30.819940: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-09-25 18:48:30.819974: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303)
2023-09-25 18:48:30.819992: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (bohrium-14076-1043333): /proc/driver/nvidia/version does not exist
2023-09-25 18:48:30.835733: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
WARNING:tensorflow:From /opt/mamba/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
WARNING:deepmd.utils.batch_size:You can use the environment variable DP_INFER_BATCH_SIZE tocontrol the inference batch size (nframes * natoms). The default value is 1024.
[16]
import matplotlib.pyplot as plt
import numpy as np

plt.scatter(training_systems["energies"], predict["energies"])

x_range = np.linspace(plt.xlim()[0], plt.xlim()[1])

plt.plot(x_range, x_range, "r--", linewidth = 0.25)
plt.xlabel("Energy of DFT") # set the x-axis label
plt.ylabel("Energy predicted by deep potential") # set the y-axis label
plt.show()
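The scatter plot gives a visual check only. To put numbers on the agreement, one option is to compute the MAE, the RMSE and the Pearson correlation coefficient between the two energy arrays. Here is a minimal sketch. The function name `parity_metrics` is hypothetical, and synthetic stand-in data is used so the block runs without the trained model.

```python
import numpy as np


def parity_metrics(reference, predicted):
    """Return (MAE, RMSE, Pearson r) for two 1-D arrays of equal length."""
    reference = np.asarray(reference, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    diff = predicted - reference
    mae = float(np.mean(np.abs(diff)))
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    pearson_r = float(np.corrcoef(reference, predicted)[0, 1])
    return mae, rmse, pearson_r


# Synthetic stand-in data (assumption, not tutorial results).
rng = np.random.default_rng(0)
e_ref = -219.8 + 0.05 * rng.standard_normal(200)
e_pred = e_ref + 0.003 * rng.standard_normal(200)

mae, rmse, r = parity_metrics(e_ref, e_pred)
print(f"MAE = {mae:.3e} eV, RMSE = {rmse:.3e} eV, Pearson r = {r:.4f}")
```

In the notebook, the same function could be called as `parity_metrics(training_systems["energies"], predict["energies"])`.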

7 Running MD with LAMMPS

The trained model can drive molecular dynamics simulations in LAMMPS.

[17]
! cd ./DeePMD-kit_Tutorial/02.lmp && cp ../01.train/graph.pb ./ && tree -L 1
.
├── conf.lmp
├── graph.pb
└── in.lammps

0 directories, 3 files

Here, conf.lmp provides the initial configuration for the MD simulation of gas-phase methane.


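The file in.lammps is the LAMMPS input script. If you inspect it, you will find a fairly standard LAMMPS input file for an MD simulation (for more on LAMMPS MD input files, see "From DFT to MD | A Detailed Hands-On Guide to Deep Potential Materials Computation", Chapter 1).

Only two lines are specific to DeePMD-kit: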

pair_style  deepmd graph.pb
pair_coeff  * *

These lines invoke the DeePMD-kit pair_style and pass it the model file graph.pb, which means the interatomic interactions are computed by the DP model stored in graph.pb.
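To show where these two lines sit in a complete input, the sketch below assembles a minimal LAMMPS input from Python. It is not the tutorial's in.lammps. The units, timestep, thermo interval, neighbor settings and step count are read off the LAMMPS log shown later in this section. The masses, the type order (1 = H, 2 = C), the velocity seed and the NVT thermostat are assumptions. The function name `build_lammps_input` is hypothetical.

```python
def build_lammps_input(model="graph.pb", data="conf.lmp",
                       temperature=50.0, steps=5000):
    """Return the text of a minimal LAMMPS input driven by a DP model.

    Illustrative only: this is not the tutorial's in.lammps.
    """
    lines = [
        "units           metal",
        "boundary        p p p",
        "atom_style      atomic",
        "neighbor        1.0 bin",
        "neigh_modify    every 10 delay 0 check no",
        f"read_data       {data}",
        # Assumed type order (1 = H, 2 = C); check the model's type_map.
        "mass            1 1.008",
        "mass            2 12.011",
        # The two DeePMD-kit specific lines quoted above.
        f"pair_style      deepmd {model}",
        "pair_coeff      * *",
        # Assumed seed and thermostat settings.
        f"velocity        all create {temperature} 12345",
        f"fix             1 all nvt temp {temperature} {temperature} 0.5",
        "timestep        0.001",
        "thermo_style    custom step pe ke etotal temp press vol",
        "thermo          100",
        f"run             {steps}",
    ]
    return "\n".join(lines) + "\n"


print(build_lammps_input())
```

Generating the input from a function makes it easy to swap in a different model file, for example a compressed model, without editing the script by hand.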

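In an environment with a compatible version of LAMMPS, the deep potential molecular dynamics simulation can be run with the following command: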

[19]
! cd ./DeePMD-kit_Tutorial/02.lmp && lmp -i in.lammps
Warning:
This LAMMPS executable is in a conda environment, but the environment has
not been activated. Libraries may fail to load. To activate this environment
please see https://conda.io/activation.
LAMMPS (23 Jun 2022 - Update 1)
OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98)
  using 1 OpenMP thread(s) per MPI task
Loaded 1 plugins from /opt/deepmd-kit-2.2.1/lib/deepmd_lmp
Reading data file ...
  triclinic box = (0 0 0) to (10.114259 10.263124 10.216793) with tilt (0.036749877 0.13833062 -0.056322169)
  1 by 1 by 1 MPI processor grid
  reading atoms ...
  5 atoms
  read_data CPU = 0.006 seconds
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
Summary of lammps deepmd module ...
  >>> Info of deepmd-kit:
  installed to:       /opt/deepmd-kit-2.2.1
  source:             v2.2.1
  source branch:       HEAD
  source commit:      3ac8c4c7
  source commit at:   2023-03-16 12:33:24 +0800
  surpport model ver.:1.1 
  build variant:      cuda
  build with tf inc:  /opt/deepmd-kit-2.2.1/include;/opt/deepmd-kit-2.2.1/include
  build with tf lib:  /opt/deepmd-kit-2.2.1/lib/libtensorflow_cc.so
  set tf intra_op_parallelism_threads: 0
  set tf inter_op_parallelism_threads: 0
  >>> Info of lammps module:
  use deepmd-kit at:  /opt/deepmd-kit-2.2.1
DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information.
DeePMD-kit: Successfully load libcudart.so
2023-04-26 21:15:18.066727: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-04-26 21:15:18.068055: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2023-04-26 21:15:18.068079: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-04-26 21:15:18.068098: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (bohrium-14076-1015294): /proc/driver/nvidia/version does not exist
2023-04-26 21:15:18.068145: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2023-04-26 21:15:18.105025: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
  >>> Info of model(s):
  using   1 model(s): graph.pb 
  rcut in model:      6
  ntypes in model:    2

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Your simulation uses code contributions which should be cited:
- USER-DEEPMD package:
The log file lists these citations in BibTeX format.

CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE

Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
  update every 10 steps, delay 0 steps, check no
  max neighbors/atom: 2000, page size: 100000
  master list distance cutoff = 7
  ghost atom cutoff = 7
  binsize = 3.5, bins = 3 3 3
  1 neighbor lists, perpetual/occasional/extra = 1 0 0
  (1) pair deepmd, perpetual
      attributes: full, newton on
      pair build: full/bin/atomonly
      stencil: full/bin/3d
      bin: standard
Setting up Verlet run ...
  Unit style    : metal
  Current step  : 0
  Time step     : 0.001
Per MPI rank memory allocation (min/avg/max) = 3.809 | 3.809 | 3.809 Mbytes
   Step         PotEng         KinEng         TotEng          Temp          Press          Volume    
         0  -219.77215      0.025852029   -219.7463       50            -801.35984      1060.5429    
       100  -219.76396      0.017664638   -219.7463       34.164897     -710.62981      1060.5429    
       200  -219.77149      0.024122186   -219.74736      46.654338     -711.96738      1060.5429    
       300  -219.7802       0.031134653   -219.74906      60.21704      -611.8964       1060.5429    
       400  -219.78511      0.034345795   -219.75076      66.427658     -543.74306      1060.5429    
       500  -219.77938      0.026617371   -219.75276      51.480235     -256.92332      1060.5429    
       600  -219.77811      0.022678846   -219.75543      43.862797     -169.79613      1060.5429    
       700  -219.78325      0.024310525   -219.75894      47.018601      103.54304      1060.5429    
       800  -219.78936      0.026896734   -219.76246      52.020547      334.4249       1060.5429    
       900  -219.78861      0.022673959   -219.76593      43.853346      457.61058      1060.5429    
      1000  -219.78237      0.013228071   -219.76914      25.584203      613.98549      1060.5429    
      1100  -219.78334      0.011011633   -219.77233      21.297426      581.19738      1060.5429    
      1200  -219.78828      0.013230047   -219.77505      25.588025      428.34763      1060.5429    
      1300  -219.79344      0.016537932   -219.77691      31.985752      88.86934       1060.5429    
      1400  -219.78974      0.01196994    -219.77777      23.150872     -177.09109      1060.5429    
      1500  -219.7875       0.0093263051  -219.77818      18.037859     -423.20199      1060.5429    
      1600  -219.78605      0.0078568877  -219.77819      15.195882     -510.2681       1060.5429    
      1700  -219.78998      0.012262459   -219.77772      23.716627     -465.94061      1060.5429    
      1800  -219.7906       0.014094553   -219.77651      27.260052     -220.36254      1060.5429    
      1900  -219.78914      0.014923686   -219.77421      28.863664      75.170119      1060.5429    
      2000  -219.78455      0.014342543   -219.77021      27.739685      409.182        1060.5429    
      2100  -219.78259      0.018506926   -219.76408      35.793953      617.28225      1060.5429    
      2200  -219.78156      0.02406513    -219.7575       46.543987      708.58178      1060.5429    
      2300  -219.77987      0.027365758   -219.7525       52.927678      720.15556      1060.5429    
      2400  -219.77592      0.026288183   -219.74964      50.843558      739.34675      1060.5429    
      2500  -219.77697      0.028798823   -219.74817      55.699349      626.33539      1060.5429    
      2600  -219.78053      0.032950743   -219.74758      63.729511      406.56382      1060.5429    
      2700  -219.78321      0.036132461   -219.74708      69.883221      344.97396      1060.5429    
      2800  -219.78393      0.03656966    -219.74736      70.728801      230.07307      1060.5429    
      2900  -219.78068      0.032747595   -219.74793      63.336605     -31.188317      1060.5429    
      3000  -219.77856      0.029617088   -219.74895      57.281942     -89.082493      1060.5429    
      3100  -219.77995      0.029669541   -219.75028      57.38339      -251.79091      1060.5429    
      3200  -219.78257      0.030832857   -219.75174      59.63334      -491.45316      1060.5429    
      3300  -219.78018      0.026659797   -219.75352      51.562291     -582.64486      1060.5429    
      3400  -219.77748      0.021998758   -219.75548      42.547449     -674.39272      1060.5429    
      3500  -219.77262      0.015169959   -219.75745      29.339977     -683.60986      1060.5429    
      3600  -219.77832      0.018848271   -219.75947      36.454143     -677.52481      1060.5429    
      3700  -219.78189      0.020534924   -219.76136      39.716271     -586.26573      1060.5429    
      3800  -219.78723      0.024025438   -219.76321      46.467219     -436.71237      1060.5429    
      3900  -219.78487      0.02016304    -219.76471      38.997017     -89.299256      1060.5429    
      4000  -219.78528      0.019459969   -219.76582      37.637218      122.11424      1060.5429    
      4100  -219.78381      0.017357567   -219.76645      33.570996      430.86402      1060.5429    
      4200  -219.78769      0.020829206   -219.76686      40.285437      570.56462      1060.5429    
      4300  -219.78391      0.016952217   -219.76695      32.787015      630.93528      1060.5429    
      4400  -219.78372      0.016997858   -219.76672      32.875289      582.68207      1060.5429    
      4500  -219.78346      0.017577192   -219.76589      33.99577       417.88743      1060.5429    
      4600  -219.78642      0.022365458   -219.76406      43.256679      258.42794      1060.5429    
      4700  -219.78662      0.025541027   -219.76108      49.398497     -51.503321      1060.5429    
      4800  -219.78269      0.025961485   -219.75673      50.211697     -236.48923      1060.5429    
      4900  -219.78036      0.029177037   -219.75118      56.430845     -408.65375      1060.5429    
      5000  -219.77248      0.028022633   -219.74446      54.198132     -534.0945       1060.5429    
Loop time of 10.5269 on 1 procs for 5000 steps with 5 atoms

Performance: 41.038 ns/day, 0.585 hours/ns, 474.972 timesteps/s
263.6% CPU use with 1 MPI tasks x 1 OpenMP threads

MPI task timing breakdown:
Section |  min time  |  avg time  |  max time  |%varavg| %total
---------------------------------------------------------------
Pair    | 10.484     | 10.484     | 10.484     |   0.0 | 99.59
Neigh   | 0.006353   | 0.006353   | 0.006353   |   0.0 |  0.06
Comm    | 0.011002   | 0.011002   | 0.011002   |   0.0 |  0.10
Output  | 0.0040709  | 0.0040709  | 0.0040709  |   0.0 |  0.04
Modify  | 0.016528   | 0.016528   | 0.016528   |   0.0 |  0.16
Other   |            | 0.005208   |            |       |  0.05

Nlocal:              5 ave           5 max           5 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost:            130 ave         130 max         130 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs:              0 ave           0 max           0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
FullNghs:           20 ave          20 max          20 min
Histogram: 1 0 0 0 0 0 0 0 0 0

Total # of neighbors = 20
Ave neighs/atom = 4
Neighbor list builds = 500
Dangerous builds not checked
Total wall time: 0:00:11
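The thermo table in the log above can be loaded into arrays for further analysis, such as the average temperature or the drift in total energy. Here is a minimal sketch of such a parser. The function name `parse_thermo` is hypothetical, and only a few rows copied from the log are embedded so the block is self-contained.

```python
import numpy as np

# A few rows copied from the thermo output above.
SAMPLE_LOG = """\
   Step         PotEng         KinEng         TotEng          Temp          Press          Volume
         0  -219.77215      0.025852029   -219.7463       50            -801.35984      1060.5429
       100  -219.76396      0.017664638   -219.7463       34.164897     -710.62981      1060.5429
       200  -219.77149      0.024122186   -219.74736      46.654338     -711.96738      1060.5429
Loop time of 10.5269 on 1 procs for 5000 steps with 5 atoms
"""


def parse_thermo(log_text):
    """Parse the first thermo table of a LAMMPS log into {column: ndarray}."""
    header, rows = None, []
    for line in log_text.splitlines():
        fields = line.split()
        if header is None:
            # The table starts at the line whose first field is "Step".
            if fields and fields[0] == "Step":
                header = fields
            continue
        if line.startswith("Loop time"):
            break  # end of the run section
        if len(fields) == len(header):
            try:
                rows.append([float(x) for x in fields])
            except ValueError:
                continue  # skip non-numeric lines mixed into the table
    if header is None:
        raise ValueError("no thermo header found")
    data = np.array(rows, dtype=float).reshape(-1, len(header))
    return {name: data[:, i] for i, name in enumerate(header)}


thermo = parse_thermo(SAMPLE_LOG)
print("mean Temp (K):", thermo["Temp"].mean())
print("TotEng change (eV):", thermo["TotEng"][-1] - thermo["TotEng"][0])
```

For a real run, the text of the LAMMPS log file (log.lammps by default) can be passed in place of `SAMPLE_LOG`.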

References

  1. Han Wang, Linfeng Zhang, Jiequn Han, and Weinan E. DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics. Comput. Phys. Comm., 228:178–184, 2018. doi:10.1016/j.cpc.2018.03.016.
  2. Jinzhe Zeng, Duo Zhang, Denghui Lu, Pinghui Mo, Zeyu Li, Yixiao Chen, Marián Rynik, Li'ang Huang, Ziyao Li, Shaochen Shi, Yingze Wang, Haotian Ye, Ping Tuo, Jiabin Yang, Ye Ding, Yifan Li, Davide Tisi, Qiyu Zeng, Han Bao, Yu Xia, Jiameng Huang, Koki Muraoka, Yibo Wang, Junhan Chang, Fengbo Yuan, Sigbjørn Løland Bore, Chun Cai, Yinnian Lin, Bo Wang, Jiayan Xu, Jia-Xin Zhu, Chenxing Luo, Yuzhi Zhang, Rhys E. A. Goodall, Wenshuo Liang, Anurag Kumar Singh, Sikai Yao, Jingchao Zhang, Renata Wentzcovitch, Jiequn Han, Jie Liu, Weile Jia, Darrin M. York, Weinan E, Roberto Car, Linfeng Zhang, and Han Wang. DeePMD-kit v2: A software package for Deep Potential models. 2023. doi:10.48550/arXiv.2304.09409.
  3. https://docs.deepmodeling.com/projects/deepmd/en/master/index.html
  4. https://github.com/deepmodeling/deepmd-kit