Quick Start with DP-GEN + CALYPSO | Accelerating New Materials Discovery

rainvibe

Published 2024-03-25

Recommended image: dp222-dpgen-calypso:v0.1

Recommended machine type: c2_m4_cpu

Datasets

dpgen-calypso-MgAl(v1)

dp-calypso-example(v1)

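©️ Copyright 2024 @ Authors
Author: 王振雨 (wzy@calypso.cn)
Date: 2024-03-25
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: click the blue button at the top of the page to connect, select the `dp222-dpgen-calypso:v0.1` image and the `c2_m4_cpu` node configuration, and after a short wait you can run the notebook.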

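Goals

Use DP-GEN together with CALYPSO to explore the potential energy surface quickly and build a potential.

After working through this tutorial, you will be able to:

Install dpgen and dpdata
Run MLP-accelerated structure prediction with DP and CALYPSO
Build a potential suited to structure prediction with dpgen and CALYPSO

Reading this tutorial takes at most about 10 minutes (the example calculation itself takes roughly 1 hour to finish). Let's get started!

Background

This tutorial uses the Mg-Al binary alloy system as an example to show how to explore the Mg-Al potential energy surface and train a DP potential with DP-GEN and CALYPSO. Sampling the potential energy surface with CALYPSO variable-composition structure searches is supported; at present only crystals are supported.

1. Preparation

1.1 Environment setup

DPGEN, CALYPSO, and DeePMD-kit are already installed in the image. To make sure everything runs correctly, we first set the PATH environment variable.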

[1]

import os

os.environ["PATH"]=os.getenv("PATH")+":/opt/deepmd/deepmd-kit-222/bin"

os.getenv("PATH")

'/root/.local/bin:/CALYPSO-Bohrium/utils:/opt/intel/oneapi/mkl/2023.2.0/bin/intel64:/opt/intel/oneapi/mpi/2021.7.1//libfabric/bin:/opt/intel/oneapi/mpi/2021.7.1//bin:/opt/intel/oneapi/debugger/2021.7.1/gdb/intel64/bin:/opt/intel/oneapi/compiler/2022.2.1/linux/bin/intel64:/opt/intel/oneapi/compiler/2022.2.1/linux/bin:/opt/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/deepmd/deepmd-kit-222/bin'

[31]

!which dp

/opt/deepmd/deepmd-kit-222/bin/dp
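
Re-running the PATH cell above appends the same directory again each time. A minimal sketch of an idempotent variant follows; the helper name append_to_path is hypothetical, and the directory is the one used in this image.

```python
import os
import shutil


def append_to_path(directory: str) -> str:
    """Append directory to PATH unless it is already listed; return the new PATH."""
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        entries.append(directory)
    os.environ["PATH"] = os.pathsep.join(entries)
    return os.environ["PATH"]


# Directory taken from this tutorial's image; adjust it on other machines.
append_to_path("/opt/deepmd/deepmd-kit-222/bin")
# shutil.which returns the full path of dp, or None if it is still not found.
print(shutil.which("dp"))
```

Calling append_to_path a second time leaves PATH unchanged, so the cell can be re-executed safely.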

[32]

!dpgen -h  # check that the installation succeeded

DeepModeling
------------
Version: 0.10.1.dev310+g4719ba1
Path: /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/dpgen

Dependency
------------
numpy 1.25.0 /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/numpy
dpdata 0.2.15 /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/dpdata
pymatgen unknown version or path
monty 2024.2.26 /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/monty
ase 3.22.1 /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/ase
paramiko 3.4.0 /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/paramiko
custodian 2024.3.12 /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/custodian

Reference
------------
Please cite:
Yuzhi Zhang, Haidi Wang, Weijie Chen, Jinzhe Zeng, Linfeng Zhang, Han Wang, and Weinan E,
DP-GEN: A concurrent learning platform for the generation of reliable deep learning
based potential energy models, Computer Physics Communications, 2020, 107206.
------------

Description
------------
usage: dpgen [-h]
{init_surf,init_bulk,auto_gen_param,init_reaction,run,run/report,collect,simplify,autotest,db}
...

dpgen is a convenient script that uses DeepGenerator to prepare initial data,
drive DeepMDkit and analyze results. This script works based on several sub-
commands with their own options. To see the options for the sub-commands, type
"dpgen sub-command -h".

positional arguments:
{init_surf,init_bulk,auto_gen_param,init_reaction,run,run/report,collect,simplify,autotest,db}
init_surf Generating initial data for surface systems.
init_bulk Generating initial data for bulk systems.
auto_gen_param auto gen param.json
init_reaction Generating initial data for reactive systems.
run Main process of Deep Potential Generator.
run/report Report the systems and the thermodynamic conditions of
the labeled frames.
collect Collect data.
simplify Simplify data.
autotest Auto-test for Deep Potential.
db Collecting data from DP-GEN.

options:
-h, --help show this help message and exit

If you are running dpgen+calypso on your own cluster, you can install it as follows:

Install DeePMD-kit with conda (CPU version):
conda create -n deepmd_222_cpu deepmd-kit=2.2.2=*cpu libdeepmd=2.2.2=*cpu lammps horovod -c https://conda.deepmodeling.com -c defaults
or the GPU version:
conda create -n deepmd_222_gpu deepmd-kit=2.2.2=*gpu libdeepmd=2.2.2=*gpu lammps cudatoolkit=11.6 horovod -c https://conda.deepmodeling.com -c defaults
Activate the environment: conda activate deepmd_222_gpu (or deepmd_222_cpu for the CPU version)
Install dpgen: pip install git+https://github.com/wangzyphysics/dpgen.git@devel
Get CALYPSO: http://calypso.cn/getting-calypso/

The environment is now ready.
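
After installing on your own cluster, a quick check that the Python packages can be found may save a failed run later. The sketch below uses only the standard library; the helper name missing_packages is hypothetical, and "deepmd" is the import name of DeePMD-kit.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["dpgen", "dpdata", "deepmd"])
print("missing:", missing if missing else "none")
```

Run it inside the activated conda environment; an empty result only shows that the packages are discoverable, not that CALYPSO itself is installed.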

1.2 Download the example

[15]

# Copy the example from the dataset into the /personal directory

!cp -r /bohr/dpgen-calypso-ia74/v1 /personal/dpgen-calypso-example

sh: 0: getcwd() failed: No such file or directory

[2]

cd /personal/dpgen-calypso-example

/personal/dpgen-calypso-example

[17]

ls

dpgen-calypso.zip*

[19]

!unzip dpgen-calypso.zip

(output hidden)

[34]

cd dpgen-calypso

/personal/dpgen-calypso-example/dpgen-calypso

[26]

ls *

MgAl.json  machine.json  run.sh

calypso_input:
calypso.x*  input.dat  input.dat.Al  input.dat.Mg  input.dat.MgAl

data:
trainingset/

vasp_input:
INCAR  POTCAR  POTCAR.Al  POTCAR.Mg

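2. dpgen+calypso parameter files and how to run them

The current directory contains six files and folders:

MgAl.json: the main dpgen control file, holding the parameters for training, sampling, and DFT calculations
machine.json: the control file for computing resources
run.sh: the job submission script
calypso_input: a directory containing the CALYPSO input parameter file input.dat
data: the initial training set, made up of single-point calculations on 500 random structures generated by CALYPSO
vasp_input: the input files for the first-principles code VASP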
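
MgAl.json lists every system of the initial training set by hand under init_data_sys. Assuming each system is a direct subdirectory of init_data_prefix (as names such as Mg1Al1 suggest), the list could be generated rather than typed. This is only a sketch; list_init_systems is a hypothetical helper, not part of dpgen.

```python
from pathlib import Path


def list_init_systems(prefix):
    """Return the names of the direct subdirectories of prefix, sorted."""
    return sorted(p.name for p in Path(prefix).iterdir() if p.is_dir())


prefix = Path("./data/trainingset")
if prefix.is_dir():
    # Paste the printed list into "init_data_sys" in MgAl.json.
    print(list_init_systems(prefix))
```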

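2.1 dpgen input file: machine.json

machine.json is the computing-resource configuration file. For a detailed description of the parameters, see the dpdispatcher documentation.

Keep the following in mind when using it:

command is the command to execute and has already been written for you. Note that in the model_devi section, besides command, you must also provide calypso_path, the absolute path to CALYPSO, and deepmdkit_python, the path to the DeePMD-kit Python interpreter (DP is used to relax the structures and compute the model deviation)
email/password are your Bohrium account email and password
program_id is the corresponding project ID
scass_type is the machine type to use
image_address is the image to use
group_size is the number of tasks run on each compute node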
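
The email, password, and program_id fields appear once per stage (train, model_devi, fp), so it is easy to leave one of them empty. The sketch below checks an already loaded configuration dictionary; unfilled_fields is a hypothetical helper and the example dictionary is a trimmed stand-in for the real file.

```python
def unfilled_fields(machine_cfg):
    """List 'stage[i].field' entries whose Bohrium credentials are still empty."""
    problems = []
    for stage in ("train", "model_devi", "fp"):
        for i, entry in enumerate(machine_cfg.get(stage, [])):
            profile = entry.get("machine", {}).get("remote_profile", {})
            for field in ("email", "password", "program_id"):
                if not profile.get(field):
                    problems.append(f"{stage}[{i}].{field}")
    return problems


example = {
    "train": [{"machine": {"remote_profile": {"email": "", "password": "", "program_id": 0}}}],
    "model_devi": [{"machine": {"remote_profile": {"email": "a@b.c", "password": "x", "program_id": 1}}}],
}
print(unfilled_fields(example))
```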

[27]

cat machine.json

{
  "api_version": "1.0",
  "deepmd_version": "2.2.2",

  "train" :[
    {
      "command": "/opt/deepmd/deepmd-kit-222/bin/dp",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root" : "./",
        "remote_profile":{
          "email": "",
          "password": "",
          "program_id": 00000,
            "keep_backup":true,
            "input_data":{
                "log_file": "00*/train.log",
                "grouped":true,
                "job_name": "dpgen_train_job",
                "disk_size": 100,
                "scass_type":"c8_m32_1 * NVIDIA V100",
                "checkpoint_files":["00*/checkpoint","00*/model.ckpt*"],
                "checkpoint_time":30,
                "platform": "ali",
                "job_type": "container",
                "image_address":"registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
                "on_demand":0
            }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 1,
        "queue_name": "V100_8_32",
        "group_size": 1,
          "custom_flags": [],
          "strategy": {"if_cuda_multi_devices": true},
          "para_deg": 3,
          "source_list": []
      }
    }],

  "model_devi":[
    {
      "_comment_1": "calypso_path is the local calypso path",
      "calypso_path":"/usr/local/bin/",
      "_comment_2": "deepmdkit_python is the remote machine python path",
      "deepmdkit_python":"/opt/deepmd/deepmd-kit-222/bin/python",
      "command": "",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root" : "./",
        "remote_profile":{
            "email": "",
            "password": "",
            "program_id": 0000,
            "keep_backup":true,
            "input_data":{
              "log_file": "*/model_devi.log",
              "grouped":true,
              "job_name": "dpgen_model_devi_job",
              "disk_size": 200,
              "scass_type":"c8_m32_1 * NVIDIA V100",
              "platform": "ali",
              "job_type": "container",
              "image_address":"registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
              "on_demand":0
            }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 1,
        "queue_name": "V100_8_32",
        "group_size": 50,
          "source_list": []
      }
    }],

  "fp":[
    {
      "command": "ulimit -s unlimited; mpirun -np 16 vasp_std",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root" : "./",
        "remote_profile":{
            "email": "",
            "password": "",
            "program_id": 0000,
            "input_data":{
              "api_version":2,
              "log_file": "task*/fp.log",
              "grouped":true,
              "job_name": "dpgen_fp_job",
              "disk_size": 100,
              "scass_type":"c16_m32_cpu",
              "platform": "ali",
              "job_type":"container",
              "image_address":"registry.dp.tech/dptech/vasp:5.4.4-calypso",
              "on_demand": 0
            }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 0,
        "queue_name": "CPU",
        "group_size": 10,
        "source_list": ["/opt/intel/oneapi/setvars.sh"]
      }
    }
]
}

[29]

%%writefile machine.json

{
  "api_version": "1.0",
  "deepmd_version": "2.2.2",

  "train": [
    {
      "command": "/opt/deepmd/deepmd-kit-222/bin/dp",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root": "./",
        "remote_profile": {
          "email": "wzy@calypso.cn",
          "password": "xxxx",
          "program_id": xxxxx,
          "keep_backup": true,
          "input_data": {
            "log_file": "00*/train.log",
            "grouped": true,
            "job_name": "dpgen_train_job",
            "disk_size": 100,
            "scass_type": "c8_m32_1 * NVIDIA V100",
            "checkpoint_files": ["00*/checkpoint", "00*/model.ckpt*"],
            "checkpoint_time": 30,
            "platform": "ali",
            "job_type": "container",
            "image_address": "registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
            "on_demand": 0
          }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 1,
        "queue_name": "V100_8_32",
        "group_size": 1,
        "custom_flags": [],
        "strategy": {"if_cuda_multi_devices": true},
        "para_deg": 3,
        "source_list": []
      }
    }
  ],

  "model_devi": [
    {
      "_comment_1": "calypso_path is the local calypso path",
      "calypso_path": "/usr/local/bin/",
      "_comment_2": "deepmdkit_python is the remote machine python path",
      "deepmdkit_python": "/opt/deepmd/deepmd-kit-222/bin/python",
      "command": "",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root": "./",
        "remote_profile": {
          "email": "wzy@calypso.cn",
          "password": "xxxx",
          "program_id": xxxxx,
          "keep_backup": true,
          "input_data": {
            "log_file": "*/model_devi.log",
            "grouped": true,
            "job_name": "dpgen_model_devi_job",
            "disk_size": 200,
            "scass_type": "c8_m32_1 * NVIDIA V100",
            "platform": "ali",
            "job_type": "container",
            "image_address": "registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
            "on_demand": 0
          }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 1,
        "queue_name": "V100_8_32",
        "group_size": 50,
        "source_list": []
      }
    }
  ],

  "fp": [
    {
      "command": "ulimit -s unlimited; mpirun -np 16 vasp_std",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root": "./",
        "remote_profile": {
          "email": "wzy@calypso.cn",
          "password": "xxxx",
          "program_id": xxxxx,
          "input_data": {
            "api_version": 2,
            "log_file": "task*/fp.log",
            "grouped": true,
            "job_name": "dpgen_fp_job",
            "disk_size": 100,
            "scass_type": "c16_m32_cpu",
            "platform": "ali",
            "job_type": "container",
            "image_address": "registry.dp.tech/dptech/vasp:5.4.4-calypso",
            "on_demand": 0
          }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 0,
        "queue_name": "CPU",
        "group_size": 10,
        "source_list": ["/opt/intel/oneapi/setvars.sh"]
      }
    }
  ]
}

Overwriting machine.json
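
Writing the account password into a notebook cell makes it visible to anyone who can read the notebook. One possible alternative, sketched below, keeps the credentials in environment variables and fills them into the configuration at run time. The variable names BOHRIUM_EMAIL, BOHRIUM_PASSWORD, and BOHRIUM_PROJECT_ID and the helper fill_credentials are hypothetical, and the template is a trimmed stand-in for the full machine.json.

```python
import copy
import os


def fill_credentials(machine_cfg, email, password, program_id):
    """Return a copy of machine_cfg with the Bohrium login set in every stage."""
    filled = copy.deepcopy(machine_cfg)
    for stage in ("train", "model_devi", "fp"):
        for entry in filled.get(stage, []):
            profile = entry["machine"]["remote_profile"]
            profile["email"] = email
            profile["password"] = password
            profile["program_id"] = int(program_id)
    return filled


# Hypothetical variable names; export them in the shell before starting the notebook.
email = os.environ.get("BOHRIUM_EMAIL", "")
password = os.environ.get("BOHRIUM_PASSWORD", "")
program_id = os.environ.get("BOHRIUM_PROJECT_ID", "0")

template = {"fp": [{"machine": {"remote_profile": {"email": "", "password": "", "program_id": 0}}}]}
filled = fill_credentials(template, email, password, program_id)
print(filled["fp"][0]["machine"]["remote_profile"]["email"] == email)
```

The filled dictionary can then be written to machine.json with json.dump, so the secret never appears in the notebook source.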

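2.2 dpgen input file: MgAl.json

MgAl.json is the main parameter file for dpgen run.

When building a potential with CALYPSO, pay attention to the following parameters:

model_devi_engine: str, defaults to lmp and must be changed to calypso.
calypso_input_path: str, absolute path to the CALYPSO input parameter files. When this path is given, model_devi_jobs has no effect.
model_devi_max_iter: int, the maximum number of iterations; the program stops once it is exceeded. Takes effect only when calypso_input_path is set.
model_devi_jobs: List[dict], controls sampling in a way similar to the lmp engine. The available parameters are:
- model_devi_jobs["times"], List[int], the numbers in the list are iteration indices; the iterations listed share the same CALYPSO input file
- model_devi_jobs["NameOfAtoms"], List[str], the elements of the target structure
- model_devi_jobs["NumberOfAtoms"], List[int], the number of atoms of each element in one formula unit
- model_devi_jobs["NumberOfFormula"], List[int], the range of formula units in the generated structures
- model_devi_jobs["Volume"], List[float], the volume of one formula unit, in Å^3
- model_devi_jobs["DistanceOfIon"], List[List[float]], the minimum allowed distance between elements, in Å
- model_devi_jobs["PsoRatio"], float, the fraction of structures generated by the evolutionary algorithm
- model_devi_jobs["PopSize"], int, the number of structures generated per generation
- model_devi_jobs["MaxStep"], int, the maximum number of generations CALYPSO runs
- model_devi_jobs["ICode"], int, selects the local optimization code; 15 means structures are relaxed with DP
- model_devi_jobs["Split"], str, Split mode separates structure generation from structure optimization; must be "T" in dpgen
- model_devi_jobs["PSTRESS"], List[float], pressure, in kbar
- model_devi_jobs["fmax"], float, convergence criterion for DP structure optimization; the relaxation stops when the largest force on any atom falls below fmax
- model_devi_jobs["VSC"], str, whether to generate variable-composition structures
- model_devi_jobs["MaxNumAtom"], int, the maximum number of atoms allowed in a variable-composition structure
- model_devi_jobs["CtrlRange"], List[List[int]], the range over which each element varies in variable-composition structure prediction, e.g. [[1,2],[1,3]]
We recommend simply pointing calypso_input_path at input.dat and running with that.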
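
The rules in the list above can be turned into a small pre-flight check on the parameter dictionary. This is a sketch based only on the descriptions in this section, not on dpgen's own validation; check_calypso_params is a hypothetical helper.

```python
def check_calypso_params(params):
    """Return a list of human-readable problems found in a dpgen parameter dict."""
    problems = []
    if params.get("model_devi_engine", "lmp") != "calypso":
        problems.append("model_devi_engine must be 'calypso'")
    if "calypso_input_path" in params:
        # model_devi_jobs is ignored in this case, so only the iteration cap is checked.
        if "model_devi_max_iter" not in params:
            problems.append("model_devi_max_iter is not set")
    else:
        for job in params.get("model_devi_jobs", []):
            if job.get("Split") != "T":
                problems.append("Split must be 'T' in every model_devi_jobs entry")
                break
    return problems


params = {
    "model_devi_engine": "calypso",
    "calypso_input_path": "./calypso_input",
    "model_devi_max_iter": 2,
}
print(check_calypso_params(params))
```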

[35]

!cat MgAl.json

{
	"model_devi_engine":"calypso",
	"calypso_input_path":"./calypso_input",
    "model_devi_max_iter": 2,
	"vsc":false,
    "ratio_failed":0.2,
    "type_map": [
	"Mg",
    "Al"
    ],
    "mass_map": [
        27,
        64
    ],
    "init_data_prefix":"./data/trainingset",
    "init_data_sys": [
                   "Mg1Al1",
                   "Mg1Al11",
                   "Mg1Al13",
                   "Mg1Al14",
                   "Mg1Al16",
                   "Mg1Al17",
                   "Mg1Al18",
                   "Mg1Al19",
                   "Mg1Al2",
                   "Mg1Al20",
                   "Mg1Al3",
                   "Mg1Al4",
                   "Mg1Al6",
                   "Mg1Al7",
                   "Mg1Al8",
                   "Mg1Al9",
                   "Mg2Al10",
                   "Mg2Al12",
                   "Mg2Al13",
                   "Mg2Al15",
                   "Mg2Al16",
                   "Mg2Al17",
                   "Mg2Al18",
                   "Mg2Al2",
                   "Mg2Al20",
                   "Mg2Al3",
                   "Mg2Al4",
                   "Mg2Al5",
                   "Mg2Al6",
                   "Mg2Al7",
                   "Mg2Al8",
                   "Mg2Al9",
                   "Mg3Al1",
                   "Mg3Al10",
                   "Mg3Al11",
                   "Mg3Al12",
                   "Mg3Al13",
                   "Mg3Al14",
                   "Mg3Al15",
                   "Mg3Al16",
                   "Mg3Al17",
                   "Mg3Al2",
                   "Mg3Al20",
                   "Mg3Al3",
                   "Mg3Al4",
                   "Mg3Al5",
                   "Mg3Al6",
                   "Mg3Al7",
                   "Mg3Al8",
                   "Mg3Al9",
                   "Mg4Al11",
                   "Mg4Al13",
                   "Mg4Al14",
                   "Mg4Al15",
                   "Mg4Al16",
                   "Mg4Al17",
                   "Mg4Al18",
                   "Mg4Al19",
                   "Mg4Al2",
                   "Mg4Al20",
                   "Mg4Al3",
                   "Mg4Al6",
                   "Mg5Al10",
                   "Mg5Al11",
                   "Mg5Al12",
                   "Mg5Al16",
                   "Mg5Al18",
                   "Mg5Al3",
                   "Mg5Al4",
                   "Mg5Al5",
                   "Mg5Al7",
                   "Mg5Al8",
                   "Mg6Al1",
                   "Mg6Al11",
                   "Mg6Al12",
                   "Mg6Al13",
                   "Mg6Al14",
                   "Mg6Al17",
                   "Mg6Al18",
                   "Mg6Al19",
                   "Mg6Al2",
                   "Mg6Al20",
                   "Mg6Al3",
                   "Mg6Al6",
                   "Mg6Al7",
                   "Mg6Al8",
                   "Mg6Al9",
                   "Mg7Al10",
                   "Mg7Al12",
                   "Mg7Al14",
                   "Mg7Al15",
                   "Mg7Al19",
                   "Mg7Al2",
                   "Mg7Al20",
                   "Mg7Al3",
                   "Mg7Al4",
                   "Mg7Al5",
                   "Mg7Al6",
                   "Mg8Al11",
                   "Mg8Al12",
                   "Mg8Al13",
                   "Mg8Al15",
                   "Mg8Al16",
                   "Mg8Al17",
                   "Mg8Al18",
                   "Mg8Al19",
                   "Mg8Al2",
                   "Mg8Al3",
                   "Mg8Al4",
                   "Mg8Al5",
                   "Mg8Al6",
                   "Mg8Al7",
                   "Mg8Al8",
                   "Mg8Al9",
                   "Mg9Al12",
                   "Mg9Al14",
                   "Mg9Al15",
                   "Mg9Al17",
                   "Mg9Al18",
                   "Mg9Al19",
                   "Mg9Al2",
                   "Mg9Al20",
                   "Mg9Al4",
                   "Mg9Al5",
                   "Mg9Al6",
                   "Mg9Al7",
                   "Mg9Al8",
                   "Mg9Al9",
                   "Mg10Al1",
                   "Mg10Al10",
                   "Mg10Al11",
                   "Mg10Al12",
                   "Mg10Al13",
                   "Mg10Al14",
                   "Mg10Al15",
                   "Mg10Al16",
                   "Mg10Al2",
                   "Mg10Al20",
                   "Mg10Al3",
                   "Mg10Al4",
                   "Mg10Al6",
                   "Mg10Al7",
                   "Mg10Al8",
                   "Mg10Al9",
                   "Mg11Al1",
                   "Mg11Al10",
                   "Mg11Al12",
                   "Mg11Al13",
                   "Mg11Al14",
                   "Mg11Al15",
                   "Mg11Al16",
                   "Mg11Al17",
                   "Mg11Al19",
                   "Mg11Al2",
                   "Mg11Al20",
                   "Mg11Al3",
                   "Mg11Al4",
                   "Mg11Al5",
                   "Mg11Al6",
                   "Mg11Al7",
                   "Mg11Al8",
                   "Mg11Al9",
                   "Mg12Al1",
                   "Mg12Al10",
                   "Mg12Al11",
                   "Mg12Al12",
                   "Mg12Al13",
                   "Mg12Al14",
                   "Mg12Al15",
                   "Mg12Al16",
                   "Mg12Al17",
                   "Mg12Al19",
                   "Mg12Al20",
                   "Mg12Al3",
                   "Mg12Al5",
                   "Mg12Al6",
                   "Mg12Al8",
                   "Mg13Al11",
                   "Mg13Al12",
                   "Mg13Al13",
                   "Mg13Al14",
                   "Mg13Al15",
                   "Mg13Al16",
                   "Mg13Al17",
                   "Mg13Al18",
                   "Mg13Al19",
                   "Mg13Al2",
                   "Mg13Al20",
                   "Mg13Al3",
                   "Mg13Al5",
                   "Mg13Al6",
                   "Mg13Al7",
                   "Mg13Al8",
                   "Mg13Al9",
                   "Mg14Al12",
                   "Mg14Al13",
                   "Mg14Al16",
                   "Mg14Al17",
                   "Mg14Al18",
                   "Mg14Al19",
                   "Mg14Al3",
                   "Mg14Al4",
                   "Mg14Al5",
                   "Mg14Al8",
                   "Mg14Al9",
                   "Mg15Al10",
                   "Mg15Al11",
                   "Mg15Al16",
                   "Mg15Al17",
                   "Mg15Al18",
                   "Mg15Al19",
                   "Mg15Al2",
                   "Mg15Al20",
                   "Mg15Al3",
                   "Mg15Al4",
                   "Mg15Al5",
                   "Mg15Al7",
                   "Mg15Al9",
                   "Mg16Al1",
                   "Mg16Al12",
                   "Mg16Al13",
                   "Mg16Al14",
                   "Mg16Al15",
                   "Mg16Al16",
                   "Mg16Al17",
                   "Mg16Al19",
                   "Mg16Al2",
                   "Mg16Al3",
                   "Mg16Al4",
                   "Mg16Al6",
                   "Mg16Al7",
                   "Mg17Al1",
                   "Mg17Al11",
                   "Mg17Al12",
                   "Mg17Al13",
                   "Mg17Al14",
                   "Mg17Al15",
                   "Mg17Al16",
                   "Mg17Al17",
                   "Mg17Al18",
                   "Mg17Al19",
                   "Mg17Al2",
                   "Mg17Al20",
                   "Mg17Al3",
                   "Mg17Al4",
                   "Mg17Al5",
                   "Mg17Al6",
                   "Mg17Al7",
                   "Mg17Al8",
                   "Mg17Al9",
                   "Mg18Al11",
                   "Mg18Al12",
                   "Mg18Al13",
                   "Mg18Al14",
                   "Mg18Al15",
                   "Mg18Al16",
                   "Mg18Al17",
                   "Mg18Al19",
                   "Mg18Al2",
                   "Mg18Al20",
                   "Mg18Al3",
                   "Mg18Al5",
                   "Mg18Al6",
                   "Mg18Al7",
                   "Mg18Al8",
                   "Mg18Al9",
                   "Mg19Al1",
                   "Mg19Al10",
                   "Mg19Al11",
                   "Mg19Al15",
                   "Mg19Al16",
                   "Mg19Al17",
                   "Mg19Al18",
                   "Mg19Al2",
                   "Mg19Al20",
                   "Mg19Al4",
                   "Mg19Al5",
                   "Mg19Al6",
                   "Mg19Al8",
                   "Mg19Al9",
                   "Mg20Al11",
                   "Mg20Al12",
                   "Mg20Al17",
                   "Mg20Al4",
                   "Mg20Al5",
                   "Mg20Al6",
                   "Mg20Al7",
                   "Mg20Al9"
    ],
    "_comment": " that's all ",
    "sys_configs":[""],

    "training_init_model": false,
    "numb_models": 4,
    "default_training_param": {
    "model": {
        "descriptor": {
            "type": "se_e2_a",
            "sel": [
                500,
                1500
            ],
            "rcut_smth": 2.0,
            "rcut": 6.0,
            "neuron": [
                25,
                50,
                100
            ],
            "resnet_dt": false,
            "axis_neuron": 12,
            "type_one_side": true,
            "seed": 1801819940,
            "_activation_function": "tanh"
        },
        "fitting_net": {
            "neuron": [
                240,
                240,
                240
            ],
            "resnet_dt": true,
            "seed": 2375417769
        },
        "type_map": [
            "Mg",
            "Al"
        ]
    },
    "learning_rate": {
        "type": "exp",
        "start_lr": 0.001,
        "decay_steps": 2000
    },
    "loss": {
        "start_pref_e": 0.02,
        "limit_pref_e": 1,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0,
        "limit_pref_v": 0 
    },
    "training": {
        "numb_steps": 2000,
        "seed": 3982377700,
        "disp_file": "lcurve.out",
        "disp_freq": 20,
        "numb_test": 4,
        "save_freq": 2000,
        "save_ckpt": "model.ckpt",
        "disp_training": true,
        "time_training": true,
        "profiling": false,
        "profiling_file": "timeline.json",
        "_set_prefix": "set"
    }
	},
    "model_devi_dt": 0.002,
    "model_devi_skip": 0,
    "model_devi_f_trust_lo": 0.1,
    "model_devi_f_trust_hi": 0.3,
    "model_devi_e_trust_lo": 10000000000.0,
    "model_devi_e_trust_hi": 10000000000.0,
    "model_devi_clean_traj": true,
    "model_devi_jobs": [
    ],

    "fp_style": "vasp",
    "shuffle_poscar": false,
    "fp_task_max": 10,
    "fp_task_min": 1,
    "fp_pp_path": "./vasp_input",
    "fp_pp_files": [
	    "POTCAR.Mg",
        "POTCAR.Al"
    ],
    "fp_incar":"./vasp_input/INCAR"
}
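
The two thresholds model_devi_f_trust_lo and model_devi_f_trust_hi (0.1 and 0.3 in this file) split the explored structures into three groups by their maximum force model deviation. The sketch below illustrates the usual DP-GEN convention; the exact handling of values that fall on a boundary is an assumption here, and classify_by_force_deviation is a hypothetical name.

```python
def classify_by_force_deviation(max_devi_f, trust_lo=0.1, trust_hi=0.3):
    """Label a structure as 'accurate', 'candidate', or 'failed'."""
    if max_devi_f < trust_lo:
        return "accurate"   # the models agree; no new label is needed
    if max_devi_f < trust_hi:
        return "candidate"  # worth a first-principles calculation
    return "failed"         # the models disagree too strongly to trust the structure


deviations = [0.05, 0.12, 0.29, 0.8]
labels = [classify_by_force_deviation(d) for d in deviations]
print(labels)
```

Separately, the mass_map values in this file (27 and 64) do not match the atomic masses of Mg (about 24.3) and Al (about 27.0), so they are worth correcting before the file is reused for another system or engine.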

2.3 CALYPSO input file: input.dat

input.dat is the main control file for a CALYPSO run. How structures are generated must be specified in this file. For detailed parameter descriptions, see https://iccms-calypso.github.io/CALYPSO-Fortran/

[38]

!cat ./calypso_input/input.dat

################################ The Basic Parameters of CALYPSO ################################
# A string of one or several words contain a descriptive name of the system (max. 40 characters).
SystemName =MgAl 
# Number of different atomic species in the simulation. 
NumberOfSpecies = 2
# Element symbols of the different chemical species.
NameOfAtoms =Mg Al
# Atomic Number of each chemical species.
AtomicNumber = 12 13 
# Number of atoms for each chemical species in one formula unit. 
NumberOfAtoms =  1 1  
# The range of formula unit per cell in your simulation. 
NumberOfFormula = 1 1  
# The volume per formula unit. Unit is in angstrom^3.
# Volume=20
# Minimal distance between atoms of each chemical species. Unit is in angstrom.
@DistanceOfIon 
1.48 1.44 
1.44 1.41 
@End
# It determines which algorithm should be adopted in the simulation.
Ialgo = 2
# Ialgo = 1 for Global PSO
# Ialgo = 2 for Local PSO (default value)
# The proportion of the structures generated by PSO.
PsoRatio = 0.0
# The population size. Normally, it has a larger number for larger systems.
PopSize = 50
#It determines which method should be adopted in generation the random structure.                              
GenType= 1 
# 1 under symmetric constraints
# 2 grid method for large system
# 3 and 4 core grow method 
# 0 combination of all method # If GenType=3 or 4, it determined the small unit to grow the whole structure
# It determines which local optimization method should be interfaced in the simulation.
ICode= 15
# ICode= 1 interfaced with VASP
# ICode= 2 interfaced with SIESTA
# ICode= 3 interfaced with GULP
# The number of lbest for local PSO
NumberOfLbest=4
# The Number of local optimization for each structure.
NumberOfLocalOptim= 3
# The command to perform local optimization calculation (e.g., VASP, SIESTA) on your computer.
Command = sh submit.sh
MaxTime = 9000 
# The Max step for iteration
MaxStep = 5
# If True, a previous calculation will be continued.
PickUp=F
# At which step will the previous calculation be picked up.
PickStep = 1
# If True, the local optimizations performed by parallel
Parallel= F
# The number node for parallel 
NumberOfParallel=4
#LMC = F
Split = T
##### The Parameters For Variational Stoichiometry  ##############
## If True, Variational Stoichiometry structure prediction is performed
VSC=T
## VSCEnergy= 9.68045866 1.00102117 
## The Max Number of Atoms in unit cell
MaxNumAtom=40
## The Variation Range for each type atom 
@CtrlRange         
1 20
1 20
@end
###################End Parameters for VSC ##########################

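2.4 Running

To start the run, execute the following on the command line:

dpgen run MgAl.json machine.json

Note: for your own research systems this step can take a long time, so you may want to run it in the background: nohup dpgen run MgAl.json machine.json &. We have put this line into run.sh, so you can simply run run.sh.

Each dpgen+calypso iteration has to run CALYPSO. At present, both the CALYPSO evolutionary structure generation and the model deviation calculation for the explored structures run locally, so do not shut down the node while the example is running.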
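
If you prefer to start the run from Python rather than from a shell script, the same idea (start the command, send its output to a file, do not block the cell) can be sketched with the standard library. The helper launch_in_background is hypothetical, and a harmless stand-in command is used so the example runs anywhere.

```python
import os
import subprocess
import sys
import tempfile


def launch_in_background(command, log_path):
    """Start command without blocking, sending stdout and stderr to log_path."""
    log_file = open(log_path, "w")
    process = subprocess.Popen(command, stdout=log_file, stderr=subprocess.STDOUT)
    log_file.close()  # the child process keeps its own copy of the file handle
    return process


# Stand-in command; for this tutorial it would be
# ["dpgen", "run", "MgAl.json", "machine.json"].
log_path = os.path.join(tempfile.gettempdir(), "dpgen_background.log")
proc = launch_in_background([sys.executable, "-c", "print('started')"], log_path)
proc.wait()  # only for this demo; omit it to keep working while the job runs
print(open(log_path).read().strip())
```

Unlike nohup, this does not by itself protect the job from a kernel restart, so the note above about keeping the node alive still applies.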

[39]

!bash run.sh  # the calculation takes roughly 1-2 hours to finish

[40]

ls

MgAl.json       data/      iter.000000/  out     vasp_input/
calypso_input/  dpgen.log  machine.json  run.sh

[42]

cat dpgen.log

2024-03-29 10:27:54,232 - INFO : start running
2024-03-29 10:27:54,246 - INFO : =============================iter.000000==============================
2024-03-29 10:27:54,246 - INFO : -------------------------iter.000000 task 00--------------------------
2024-03-29 10:28:11,350 - INFO : -------------------------iter.000000 task 01--------------------------
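
During a long run it is handy to extract the current iteration and task from dpgen.log instead of reading the whole file. The sketch below relies only on the "iter.NNNNNN task NN" pattern visible in the log lines above; latest_progress is a hypothetical helper.

```python
import re

ITER_TASK = re.compile(r"iter\.(\d+) task (\d+)")


def latest_progress(log_text):
    """Return (iteration, task) from the last matching log line, or None."""
    matches = ITER_TASK.findall(log_text)
    if not matches:
        return None
    iteration, task = matches[-1]
    return int(iteration), int(task)


sample = (
    "2024-03-29 10:27:54,246 - INFO : -------------------------iter.000000 task 00--------------------------\n"
    "2024-03-29 10:28:11,350 - INFO : -------------------------iter.000000 task 01--------------------------\n"
)
print(latest_progress(sample))  # (0, 1)
```

In practice the text would come from open("dpgen.log").read().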

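3. DP-accelerated CALYPSO structure prediction

Once the DPGEN+CALYPSO iterations have converged, we train the resulting model for longer and use the long-trained model for DP-accelerated CALYPSO structure prediction.

DP-accelerated CALYPSO structure prediction can be run in two ways:

Single-node mode: the normal CALYPSO running mode. Structure generation and optimization run serially, so it may be slower than the multi-node split mode. It suits situations where (CPU/GPU) resources are limited.
Multi-node split mode: with split mode enabled, CALYPSO's structure generation and structure optimization are decoupled. When resources are plentiful, each structure can occupy its own node, which greatly speeds up the structure prediction workflow. Because the Bohrium platform has ample resources, this is the recommended mode.

Both modes also work on your own cluster. The examples in this section are meant as templates for running jobs on your own cluster and are not actually run here.

For a more convenient way to use DP-accelerated CALYPSO structure prediction on the Bohrium platform, see notebook1 and notebook2.

3.1 Single-node mode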

[5]

cp -r /bohr/dp-calypso-example-ivbp/v1/dp-calypso-example/ /personal

[6]

cd /personal/dp-calypso-example

/personal/dp-calypso-example

[7]

ls *

nosplit_calypso_dp_example:
calypso_check_outcar.py*  graph.pb*   run.sh*
calypso_run_opt.py*       input.dat*  submit.sh*

split_calypso_dp_example:
calypso_check_outcar.py*  evo.dispatcher.py*  input.dat*     resources.json*
calypso_run_opt.py*       graph.pb*           machine.json*  run.sh*

[10]

cat nosplit_calypso_dp_example/run.sh

#!/bin/bash

calypso.x > caly.log 2>&1

[11]

cat nosplit_calypso_dp_example/submit.sh

#!/bin/bash

/opt/deepmd/deepmd-kit-222/bin/python calypso_run_opt.py || /opt/deepmd/deepmd-kit-222/bin/python calypso_check_outcar.py > log 2>&1

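Single-node mode needs the following files:

calypso_run_opt.py: the control script that uses ASE with DP to relax structures
calypso_check_outcar.py: the script that handles failed optimizations
graph.pb: the trained model
input.dat: the CALYPSO input parameters
submit.sh: the command that relaxes a structure, typically: python calypso_run_opt.py || python calypso_check_outcar.py
run.sh: the job submission script. On a local machine it can be run directly; under a job scheduler it must be adapted to the queue system's script format, while the command it runs stays the same: calypso.x > caly.log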
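
The || in submit.sh means the second script runs only when the first one exits with a non-zero status. The same control flow is sketched below in Python with stand-in commands; run_with_fallback is a hypothetical helper, and in submit.sh the two commands are the calypso_run_opt.py and calypso_check_outcar.py scripts.

```python
import subprocess
import sys


def run_with_fallback(primary, fallback):
    """Run primary; run fallback only if primary exits with a non-zero status."""
    if subprocess.run(primary).returncode == 0:
        return "primary"
    subprocess.run(fallback)
    return "fallback"


# Stand-in commands: the first always fails, the second always succeeds.
failing = [sys.executable, "-c", "raise SystemExit(1)"]
passing = [sys.executable, "-c", "pass"]
print(run_with_fallback(failing, passing))
```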

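The run command is: bash run.sh

3.2 Multi-node split mode

Multi-node mode mainly relies on the evo.dispatcher.py script, which calls dpdispatcher to submit structures in batches. Because dpdispatcher is quite flexible, many kinds of computing resources can be used.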

[12]

ls *

nosplit_calypso_dp_example:
calypso_check_outcar.py*  graph.pb*   run.sh*
calypso_run_opt.py*       input.dat*  submit.sh*

split_calypso_dp_example:
calypso_check_outcar.py*  evo.dispatcher.py*  input.dat*     resources.json*
calypso_run_opt.py*       graph.pb*           machine.json*  run.sh*

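Multi-node split mode needs the following files:

calypso_run_opt.py: the control script that uses ASE with DP to relax structures
calypso_check_outcar.py: the script that handles failed optimizations
graph.pb: the trained model
input.dat: the CALYPSO input parameters
run.sh: the job submission script. On a local machine it can be run directly; under a job scheduler, edit machine.json and resources.json to define the machine and resources you need. The script can be submitted to a compute node (provided compute nodes are allowed to submit further jobs to the queue) or run on the head node.
machine.json: describes the machine to use; see the dpdispatcher documentation for the parameters
resources.json: describes the resources to use; see the dpdispatcher documentation for the parameters
evo.dispatcher.py: the workflow control script; some of its parameters need to be edited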
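
For orientation, the sketch below builds a minimal machine.json and resources.json pair for a hypothetical Slurm cluster where jobs are submitted from the machine that runs the script. Every value is a placeholder, and the set of keys your dpdispatcher version accepts should be checked against its documentation.

```python
import json

# All values are placeholders for a hypothetical Slurm cluster.
machine = {
    "batch_type": "Slurm",
    "context_type": "LocalContext",
    "local_root": "./",
    "remote_root": "/path/to/scratch/work",
}
resources = {
    "number_node": 1,
    "cpu_per_node": 4,
    "gpu_per_node": 0,
    "queue_name": "normal",
    "group_size": 10,
}

machine_text = json.dumps(machine, indent=2)
resources_text = json.dumps(resources, indent=2)
print(machine_text)
print(resources_text)
```

Saving the two strings as machine.json and resources.json gives files in the form that evo.dispatcher.py loads with Machine.load_from_json and Resources.load_from_json.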

[15]

cat split_calypso_dp_example/evo.dispatcher.py

#!/usr/bin/env python
from dpdispatcher import Machine, Resources, Task, Submission
import os, sys, shutil, glob
from pathlib import Path

# os.environ["CUDA_VISIBLE_DEVICES"]="0"
MaxStep = 50
PopSize = 200
# N_INCAR = 1

# dp
python = "/opt/deepmd/deepmd-kit-222/bin/python"

forward_files = ['calypso_run_opt.py', 'calypso_check_outcar.py', 'input.dat', 'POSCAR']
backward_files = ['OUTCAR', 'CONTCAR', 'traj.traj', 'model_devi.out']
forward_common_files=['graph.pb']

command = f'{python} calypso_run_opt.py || {python} calypso_check_outcar.py > model_devi.out 2>&1'

machine = Machine.load_from_json('./machine.json')
resources = Resources.load_from_json('./resources.json')


os.system("./calypso.x")
for step in range(1, MaxStep+1):

    task_list = []

    for pop in range(1, PopSize + 1):
        task_dir = "data/step%03d.pop%03d"% (step,pop)
        Path(task_dir).mkdir(parents=True, exist_ok=True)
        shutil.copyfile("POSCAR_%d"%pop, os.path.join(task_dir, "POSCAR"))
        shutil.copyfile("POSCAR_%d"%pop, os.path.join(task_dir, "POSCAR.ori"))
        # dp
        for file_name in forward_files[:3]:
            shutil.copyfile(file_name , os.path.join(task_dir, file_name))

        task = Task(
            command=command,
            task_work_path=task_dir,
            forward_files=forward_files,
            backward_files=backward_files,
        )
        task_list.append(task)

    submission = Submission(
            work_base = os.getcwd(),
            machine= machine,
            resources =resources,
            task_list = task_list,
            forward_common_files=forward_common_files
            )
    submission.run_submission()

    for pop in range(1, PopSize + 1):
        task_dir = "data/step%03d.pop%03d"% (step,pop)
        shutil.copyfile(os.path.join(task_dir, "CONTCAR"), "CONTCAR_%d"%pop )
        shutil.copyfile(os.path.join(task_dir, "OUTCAR"), "OUTCAR_%d"%pop)

    os.system("./calypso.x")

In evo.dispatcher.py, the MaxStep, PopSize, and python variables need to be edited:

MaxStep and PopSize must match the values in input.dat
python must point to the DeePMD-kit Python interpreter
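
To avoid a mismatch, MaxStep and PopSize could be read from input.dat instead of being typed twice. The parser below is a sketch that handles only simple "Key = value" lines with integer values and skips comments; read_calypso_ints is a hypothetical helper, not part of CALYPSO or dpgen.

```python
def read_calypso_ints(text, keys=("PopSize", "MaxStep")):
    """Extract integer 'Key = value' settings from input.dat text, skipping comments."""
    found = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name in keys and value:
            found[name] = int(value.split()[0])
    return found


sample = """# The population size.
PopSize = 50
# The Max step for iteration
MaxStep = 5
"""
print(read_calypso_ints(sample))
```

In evo.dispatcher.py the text would come from open("input.dat").read(), and the two values would replace the hard-coded MaxStep and PopSize.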

[14]

cat split_calypso_dp_example/run.sh

#!/bin/bash
#SBATCH --job-name=split-mode
#SBATCH --mem-per-cpu=2gb
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=1
#SBATCH --output=%j.log
#SBATCH --partition=hebhcnormal01

# load the environment
module purge
# module load apps/PyTorch/chatglm2_py38/pytorch1.13-dtk23.04-py38

source /public/software/profile.d/compiler_intel-compiler-2017.5.239.sh 
source /public/software/profile.d/mpi_intelmpi-2018.4.274.sh 

# ./calypso.x > caly.log 2>&1
/public/share/acsp3lax5q/miniconda3/envs/workshop2023/bin/python evo.dispatcher.py > caly.log 2>&1

You can run it either by submitting the run.sh script or by running evo.dispatcher.py directly on the command line.
