Quick Start with DP-GEN + CALYPSO | Accelerating New Materials Discovery
Published: 2024-03-25
Recommended image: dp222-dpgen-calypso:v0.1
Recommended machine type: c2_m4_cpu
Attached datasets: dpgen-calypso-MgAl(v1), dp-calypso-example(v1)


©️ Copyright 2024 @ Authors
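Author: Zhenyu Wang (wzy@calypso.cn)
Date: 2024-03-25
License: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: Click the blue "Connect" button at the top of the page, select the `dp222-dpgen-calypso:v0.1` image and the `c2_m4_cpu` node configuration, and the notebook will be ready to run after a short wait.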

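Goal

Use DP-GEN together with CALYPSO to explore the potential energy surface quickly and build a potential.

After completing this tutorial, you will be able to:

  • Install dpgen and dpdata
  • Run MLP-accelerated structure prediction using DP with CALYPSO
  • Build a potential suited to structure prediction using dpgen with CALYPSO

Reading this tutorial takes at most about 10 minutes (the example calculation takes roughly 1 hour to finish). Let's get started!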

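Background

This tutorial uses the Mg-Al binary alloy system as an example to show how to explore the Mg-Al potential energy surface and train a DP potential with DP-GEN and CALYPSO. Sampling the potential energy surface with CALYPSO's variable-composition structure search is supported; at present only crystals are supported.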

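1. Preparation

1.1 Environment setup

The image already has DP-GEN, CALYPSO, and DeePMD-kit installed. To make sure everything runs correctly, first set the PATH environment variable.
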
[1]
import os
os.environ["PATH"]=os.getenv("PATH")+":/opt/deepmd/deepmd-kit-222/bin"
os.getenv("PATH")
'/root/.local/bin:/CALYPSO-Bohrium/utils:/opt/intel/oneapi/mkl/2023.2.0/bin/intel64:/opt/intel/oneapi/mpi/2021.7.1//libfabric/bin:/opt/intel/oneapi/mpi/2021.7.1//bin:/opt/intel/oneapi/debugger/2021.7.1/gdb/intel64/bin:/opt/intel/oneapi/compiler/2022.2.1/linux/bin/intel64:/opt/intel/oneapi/compiler/2022.2.1/linux/bin:/opt/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/deepmd/deepmd-kit-222/bin'
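
Re-running the cell above appends the same directory to PATH again. The following is a minimal sketch of an idempotent variant; the helper name `append_to_path` is hypothetical and not part of the original notebook, and the demo works on a plain dict so it does not touch the real environment.

```python
import os


def append_to_path(directory, env=None):
    """Append `directory` to PATH unless it is already listed; return the new PATH."""
    env = os.environ if env is None else env
    # Split the current PATH and drop empty entries.
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.append(directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


# Demo on a throwaway dict; pass no `env` argument to modify os.environ instead.
demo_env = {"PATH": "/usr/bin"}
append_to_path("/opt/deepmd/deepmd-kit-222/bin", demo_env)
append_to_path("/opt/deepmd/deepmd-kit-222/bin", demo_env)  # second call changes nothing
print(demo_env["PATH"])
```

Calling `append_to_path("/opt/deepmd/deepmd-kit-222/bin")` without a second argument would update the notebook's own environment.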
[31]
!which dp
/opt/deepmd/deepmd-kit-222/bin/dp
[32]
!dpgen -h  # check that the installation succeeded
DeepModeling
------------
Version: 0.10.1.dev310+g4719ba1
Path:    /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/dpgen

Dependency
------------
     numpy     1.25.0   /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/numpy
    dpdata     0.2.15   /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/dpdata
  pymatgen            unknown version or path
     monty  2024.2.26   /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/monty
       ase     3.22.1   /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/ase
  paramiko      3.4.0   /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/paramiko
 custodian  2024.3.12   /opt/deepmd/deepmd-kit-222/lib/python3.10/site-packages/custodian

Reference
------------
Please cite:
Yuzhi Zhang, Haidi Wang, Weijie Chen, Jinzhe Zeng, Linfeng Zhang, Han Wang, and Weinan E,
DP-GEN: A concurrent learning platform for the generation of reliable deep learning
based potential energy models, Computer Physics Communications, 2020, 107206.
------------

Description
------------
usage: dpgen [-h]
             {init_surf,init_bulk,auto_gen_param,init_reaction,run,run/report,collect,simplify,autotest,db}
             ...

dpgen is a convenient script that uses DeepGenerator to prepare initial data,
drive DeepMDkit and analyze results. This script works based on several sub-
commands with their own options. To see the options for the sub-commands, type
"dpgen sub-command -h".

positional arguments:
  {init_surf,init_bulk,auto_gen_param,init_reaction,run,run/report,collect,simplify,autotest,db}
    init_surf           Generating initial data for surface systems.
    init_bulk           Generating initial data for bulk systems.
    auto_gen_param      auto gen param.json
    init_reaction       Generating initial data for reactive systems.
    run                 Main process of Deep Potential Generator.
    run/report          Report the systems and the thermodynamic conditions of
                        the labeled frames.
    collect             Collect data.
    simplify            Simplify data.
    autotest            Auto-test for Deep Potential.
    db                  Collecting data from DP-GEN.

options:
  -h, --help            show this help message and exit

If you run dpgen+calypso on your own cluster, install it as follows:

  1. Install DeePMD-kit with conda: conda create -n deepmd_222_cpu deepmd-kit=2.2.2=*cpu libdeepmd=2.2.2=*cpu lammps horovod -c https://conda.deepmodeling.com -c defaults (CPU version), or conda create -n deepmd_222_gpu deepmd-kit=2.2.2=*gpu libdeepmd=2.2.2=*gpu lammps cudatoolkit=11.6 horovod -c https://conda.deepmodeling.com -c defaults (GPU version)
  2. Activate the environment: conda activate deepmd_222_gpu (or deepmd_222_cpu for the CPU version)
  3. Install dpgen: pip install git+https://github.com/wangzyphysics/dpgen.git@devel
  4. Obtain CALYPSO: http://calypso.cn/getting-calypso/

The environment is now ready.
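
After installing on your own cluster, a quick check that the executables can be found may save a failed run later. This is a minimal sketch using only the standard library; the helper name `find_missing_tools` is hypothetical, and it assumes the tools are meant to be on PATH (CALYPSO can instead be located through `calypso_path` in machine.json, in which case `calypso.x` being reported as missing here is harmless).

```python
import shutil


def find_missing_tools(names):
    """Return the executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# Executable names used in this tutorial.
missing = find_missing_tools(["dp", "dpgen", "calypso.x"])
print("missing:", missing if missing else "none")
```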


1.2 Download the example

[15]
# Copy the example from the dataset into the /personal directory
!cp -r /bohr/dpgen-calypso-ia74/v1 /personal/dpgen-calypso-example
sh: 0: getcwd() failed: No such file or directory
[2]
cd /personal/dpgen-calypso-example
/personal/dpgen-calypso-example
[17]
ls
dpgen-calypso.zip*
[19]
!unzip dpgen-calypso.zip
(output hidden)
[34]
cd dpgen-calypso
/personal/dpgen-calypso-example/dpgen-calypso
[26]
ls *
MgAl.json  machine.json  run.sh

calypso_input:
calypso.x*  input.dat  input.dat.Al  input.dat.Mg  input.dat.MgAl

data:
trainingset/

vasp_input:
INCAR  POTCAR  POTCAR.Al  POTCAR.Mg
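
2. dpgen+calypso parameter files and running

The current directory contains six files and directories:

  • MgAl.json: the main dpgen control file, containing the training, sampling, and DFT calculation parameters
  • machine.json: the control file for computing resources
  • run.sh: the job submission script
  • calypso_input: directory containing the CALYPSO input parameter file input.dat
  • data: the initial training set, consisting of single-point calculations on 500 random structures generated by CALYPSO
  • vasp_input: input files for the first-principles code VASP
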

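2.1 dpgen input file: machine.json

machine.json configures the computing resources; see the dpdispatcher documentation for a detailed description of the parameters.

Points to note when using it:

  • command is the command to execute and is already filled in. Note that in the model_devi section, besides command, you must also provide the absolute path to CALYPSO (calypso_path) and the DeePMD-kit Python path (deepmdkit_python); DP is used to optimize the structures and compute the model deviation
  • email/password are your Bohrium account and password
  • program_id is the corresponding project ID
  • scass_type is the machine type to request
  • image_address is the image to use
  • group_size is the number of tasks run on each compute node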
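
The fields listed above can be checked before submitting anything. The sketch below works on an already-loaded dict and only tests for the keys named in this section; the function name `check_machine_config` and the shortened demo dict are hypothetical, and the real file has many more fields.

```python
def check_machine_config(config):
    """Return a list of human-readable problems found in a machine.json-like dict."""
    problems = []
    for stage in ("train", "model_devi", "fp"):
        entries = config.get(stage, [])
        if not entries:
            problems.append(f"{stage}: section is missing")
            continue
        entry = entries[0]
        profile = entry.get("machine", {}).get("remote_profile", {})
        # Empty strings and a zero program_id count as "not filled in".
        for key in ("email", "password", "program_id"):
            if not profile.get(key):
                problems.append(f"{stage}: remote_profile.{key} is empty")
        if stage == "model_devi":
            for key in ("calypso_path", "deepmdkit_python"):
                if not entry.get(key):
                    problems.append(f"{stage}: {key} is empty")
    return problems


# Shortened, made-up configuration: only model_devi is present and its password is blank.
demo = {
    "model_devi": [{
        "calypso_path": "/usr/local/bin/",
        "deepmdkit_python": "/opt/deepmd/deepmd-kit-222/bin/python",
        "machine": {"remote_profile": {"email": "user@example.com",
                                       "password": "", "program_id": 1}},
    }],
}
for problem in check_machine_config(demo):
    print(problem)
```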
[27]
cat machine.json
{
  "api_version": "1.0",
  "deepmd_version": "2.2.2",

  "train" :[
    {
      "command": "/opt/deepmd/deepmd-kit-222/bin/dp",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root" : "./",
        "remote_profile":{
          "email": "",
          "password": "",
          "program_id": 00000,
            "keep_backup":true,
            "input_data":{
                "log_file": "00*/train.log",
                "grouped":true,
                "job_name": "dpgen_train_job",
                "disk_size": 100,
                "scass_type":"c8_m32_1 * NVIDIA V100",
                "checkpoint_files":["00*/checkpoint","00*/model.ckpt*"],
                "checkpoint_time":30,
                "platform": "ali",
                "job_type": "container",
                "image_address":"registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
                "on_demand":0
            }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 1,
        "queue_name": "V100_8_32",
        "group_size": 1,
          "custom_flags": [],
          "strategy": {"if_cuda_multi_devices": true},
          "para_deg": 3,
          "source_list": []
      }
    }],

  "model_devi":[
    {
      "_comment_1": "calypso_path is the local calypso path",
      "calypso_path":"/usr/local/bin/",
      "_comment_2": "deepmdkit_python is the remote machine python path",
      "deepmdkit_python":"/opt/deepmd/deepmd-kit-222/bin/python",
      "command": "",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root" : "./",
        "remote_profile":{
            "email": "",
            "password": "",
            "program_id": 0000,
            "keep_backup":true,
            "input_data":{
              "log_file": "*/model_devi.log",
              "grouped":true,
              "job_name": "dpgen_model_devi_job",
              "disk_size": 200,
              "scass_type":"c8_m32_1 * NVIDIA V100",
              "platform": "ali",
              "job_type": "container",
              "image_address":"registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
              "on_demand":0
            }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 1,
        "queue_name": "V100_8_32",
        "group_size": 50,
          "source_list": []
      }
    }],

  "fp":[
    {
      "command": "ulimit -s unlimited; mpirun -np 16 vasp_std",
      "machine": {
        "batch_type": "Bohrium",
        "context_type": "BohriumContext",
        "local_root" : "./",
        "remote_profile":{
            "email": "",
            "password": "",
            "program_id": 0000,
            "input_data":{
              "api_version":2,
              "log_file": "task*/fp.log",
              "grouped":true,
              "job_name": "dpgen_fp_job",
              "disk_size": 100,
              "scass_type":"c16_m32_cpu",
              "platform": "ali",
              "job_type":"container",
              "image_address":"registry.dp.tech/dptech/vasp:5.4.4-calypso",
              "on_demand": 0
            }
        }
      },
      "resources": {
        "number_node": 1,
        "cpu_per_node": 8,
        "gpu_per_node": 0,
        "queue_name": "CPU",
        "group_size": 10,
        "source_list": ["/opt/intel/oneapi/setvars.sh"]
      }
    }
]
}
[29]
%%writefile machine.json
{
"api_version": "1.0",
"deepmd_version": "2.2.2",

"train" :[
{
"command": "/opt/deepmd/deepmd-kit-222/bin/dp",
"machine": {
"batch_type": "Bohrium",
"context_type": "BohriumContext",
"local_root" : "./",
"remote_profile":{
"email": "wzy@calypso.cn",
"password": "xxxx",
"program_id": xxxxx,
"keep_backup":true,
"input_data":{
"log_file": "00*/train.log",
"grouped":true,
"job_name": "dpgen_train_job",
"disk_size": 100,
"scass_type":"c8_m32_1 * NVIDIA V100",
"checkpoint_files":["00*/checkpoint","00*/model.ckpt*"],
"checkpoint_time":30,
"platform": "ali",
"job_type": "container",
"image_address":"registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
"on_demand":0
}
}
},
"resources": {
"number_node": 1,
"cpu_per_node": 8,
"gpu_per_node": 1,
"queue_name": "V100_8_32",
"group_size": 1,
"custom_flags": [],
"strategy": {"if_cuda_multi_devices": true},
"para_deg": 3,
"source_list": []
}
}],

"model_devi":[
{
"_comment_1": "calypso_path is the local calypso path",
"calypso_path":"/usr/local/bin/",
"_comment_2": "deepmdkit_python is the remote machine python path",
"deepmdkit_python":"/opt/deepmd/deepmd-kit-222/bin/python",
"command": "",
"machine": {
"batch_type": "Bohrium",
"context_type": "BohriumContext",
"local_root" : "./",
"remote_profile":{
"email": "wzy@calypso.cn",
"password": "xxxx",
"program_id": xxxxx,
"keep_backup":true,
"input_data":{
"log_file": "*/model_devi.log",
"grouped":true,
"job_name": "dpgen_model_devi_job",
"disk_size": 200,
"scass_type":"c8_m32_1 * NVIDIA V100",
"platform": "ali",
"job_type": "container",
"image_address":"registry.dp.tech/dptech/prod-11265/dp222-dpgen-calypso:v0.1",
"on_demand":0
}
}
},
"resources": {
"number_node": 1,
"cpu_per_node": 8,
"gpu_per_node": 1,
"queue_name": "V100_8_32",
"group_size": 50,
"source_list": []
}
}],

"fp":[
{
"command": "ulimit -s unlimited; mpirun -np 16 vasp_std",
"machine": {
"batch_type": "Bohrium",
"context_type": "BohriumContext",
"local_root" : "./",
"remote_profile":{
"email": "wzy@calypso.cn",
"password": "xxxx",
"program_id": xxxxx,
"input_data":{
"api_version":2,
"log_file": "task*/fp.log",
"grouped":true,
"job_name": "dpgen_fp_job",
"disk_size": 100,
"scass_type":"c16_m32_cpu",
"platform": "ali",
"job_type":"container",
"image_address":"registry.dp.tech/dptech/vasp:5.4.4-calypso",
"on_demand": 0
}
}
},
"resources": {
"number_node": 1,
"cpu_per_node": 8,
"gpu_per_node": 0,
"queue_name": "CPU",
"group_size": 10,
"source_list": ["/opt/intel/oneapi/setvars.sh"]
}
}
]
}
Overwriting machine.json
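
2.2 dpgen input file: MgAl.json

MgAl.json is the main parameter file for dpgen run.

When building a potential with CALYPSO, pay attention to the following parameters:

  • model_devi_engine: str, defaults to lmp; must be changed to calypso.

  • calypso_input_path: str, absolute path to the CALYPSO input parameter files; when this path is provided, model_devi_jobs has no effect

  • model_devi_max_iter: int, maximum number of iterations; the program stops once it is exceeded. Takes effect when calypso_input_path is set

  • model_devi_jobs: List[dict], controls sampling in a way similar to the lmp engine. The adjustable parameters are:

    • model_devi_jobs["times"], List[int], iteration indices; the iterations listed here share the same CALYPSO input file

    • model_devi_jobs["NameOfAtoms"], List[str], element list of the target structures

    • model_devi_jobs["NumberOfAtoms"], List[int], number of atoms of each element in one formula unit

    • model_devi_jobs["NumberOfFormula"], List[int], range of formula units for generated structures

    • model_devi_jobs["Volume"], List[float], volume per formula unit, in Å^3

    • model_devi_jobs["DistanceOfIon"], List[List[float]], minimum allowed distance between elements, in Å

    • model_devi_jobs["PsoRatio"], float, fraction of structures generated by the evolutionary algorithm

    • model_devi_jobs["PopSize"], int, number of structures generated per generation

    • model_devi_jobs["MaxStep"], int, maximum number of generations for the CALYPSO run

    • model_devi_jobs["ICode"], int, choice of local optimization code; 15 means structures are optimized with DP

    • model_devi_jobs["Split"], str, split mode separates structure generation from structure optimization; must be "T" in dpgen

    • model_devi_jobs["PSTRESS"], List[float], pressure, in kbar

    • model_devi_jobs["fmax"], float, convergence criterion for DP structure optimization; optimization stops when the largest force on any atom falls below fmax

    • model_devi_jobs["VSC"], str, whether to generate variable-composition structures

    • model_devi_jobs["MaxNumAtom"], int, maximum number of atoms allowed in variable-composition structures

    • model_devi_jobs["CtrlRange"], List[List[int]], range over which each element varies in variable-composition structure prediction, e.g. [[1,2],[1,3]]

    We recommend simply pointing calypso_input_path at the input.dat location and running from that.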
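
The relationship between these parameters can be sketched as a small pre-flight check. The function name `describe_calypso_sampling` is hypothetical, and treating model_devi_max_iter as mandatory whenever calypso_input_path is set is an assumption of this sketch, not a statement about what dpgen itself enforces.

```python
def describe_calypso_sampling(param):
    """Summarize how sampling is configured in an MgAl.json-like dict."""
    if param.get("model_devi_engine", "lmp") != "calypso":
        raise ValueError('model_devi_engine must be set to "calypso"')
    if param.get("calypso_input_path"):
        # With an input path, model_devi_jobs is ignored and the iteration cap applies.
        if "model_devi_max_iter" not in param:
            raise ValueError("model_devi_max_iter should accompany calypso_input_path")
        return (f"input.dat from {param['calypso_input_path']}, "
                f"at most {param['model_devi_max_iter']} iterations")
    if param.get("model_devi_jobs"):
        return f"{len(param['model_devi_jobs'])} model_devi_jobs entries"
    raise ValueError("provide calypso_input_path or model_devi_jobs")


# Values taken from the MgAl.json used in this tutorial.
print(describe_calypso_sampling({
    "model_devi_engine": "calypso",
    "calypso_input_path": "./calypso_input",
    "model_devi_max_iter": 2,
    "model_devi_jobs": [],
}))
```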

[35]
!cat MgAl.json
{
	"model_devi_engine":"calypso",
	"calypso_input_path":"./calypso_input",
    "model_devi_max_iter": 2,
	"vsc":false,
    "ratio_failed":0.2,
    "type_map": [
	"Mg",
    "Al"
    ],
    "mass_map": [
        27,
        64
    ],
    "init_data_prefix":"./data/trainingset",
    "init_data_sys": [
                   "Mg1Al1",
                   "Mg1Al11",
                   "Mg1Al13",
                   "Mg1Al14",
                   "Mg1Al16",
                   "Mg1Al17",
                   "Mg1Al18",
                   "Mg1Al19",
                   "Mg1Al2",
                   "Mg1Al20",
                   "Mg1Al3",
                   "Mg1Al4",
                   "Mg1Al6",
                   "Mg1Al7",
                   "Mg1Al8",
                   "Mg1Al9",
                   "Mg2Al10",
                   "Mg2Al12",
                   "Mg2Al13",
                   "Mg2Al15",
                   "Mg2Al16",
                   "Mg2Al17",
                   "Mg2Al18",
                   "Mg2Al2",
                   "Mg2Al20",
                   "Mg2Al3",
                   "Mg2Al4",
                   "Mg2Al5",
                   "Mg2Al6",
                   "Mg2Al7",
                   "Mg2Al8",
                   "Mg2Al9",
                   "Mg3Al1",
                   "Mg3Al10",
                   "Mg3Al11",
                   "Mg3Al12",
                   "Mg3Al13",
                   "Mg3Al14",
                   "Mg3Al15",
                   "Mg3Al16",
                   "Mg3Al17",
                   "Mg3Al2",
                   "Mg3Al20",
                   "Mg3Al3",
                   "Mg3Al4",
                   "Mg3Al5",
                   "Mg3Al6",
                   "Mg3Al7",
                   "Mg3Al8",
                   "Mg3Al9",
                   "Mg4Al11",
                   "Mg4Al13",
                   "Mg4Al14",
                   "Mg4Al15",
                   "Mg4Al16",
                   "Mg4Al17",
                   "Mg4Al18",
                   "Mg4Al19",
                   "Mg4Al2",
                   "Mg4Al20",
                   "Mg4Al3",
                   "Mg4Al6",
                   "Mg5Al10",
                   "Mg5Al11",
                   "Mg5Al12",
                   "Mg5Al16",
                   "Mg5Al18",
                   "Mg5Al3",
                   "Mg5Al4",
                   "Mg5Al5",
                   "Mg5Al7",
                   "Mg5Al8",
                   "Mg6Al1",
                   "Mg6Al11",
                   "Mg6Al12",
                   "Mg6Al13",
                   "Mg6Al14",
                   "Mg6Al17",
                   "Mg6Al18",
                   "Mg6Al19",
                   "Mg6Al2",
                   "Mg6Al20",
                   "Mg6Al3",
                   "Mg6Al6",
                   "Mg6Al7",
                   "Mg6Al8",
                   "Mg6Al9",
                   "Mg7Al10",
                   "Mg7Al12",
                   "Mg7Al14",
                   "Mg7Al15",
                   "Mg7Al19",
                   "Mg7Al2",
                   "Mg7Al20",
                   "Mg7Al3",
                   "Mg7Al4",
                   "Mg7Al5",
                   "Mg7Al6",
                   "Mg8Al11",
                   "Mg8Al12",
                   "Mg8Al13",
                   "Mg8Al15",
                   "Mg8Al16",
                   "Mg8Al17",
                   "Mg8Al18",
                   "Mg8Al19",
                   "Mg8Al2",
                   "Mg8Al3",
                   "Mg8Al4",
                   "Mg8Al5",
                   "Mg8Al6",
                   "Mg8Al7",
                   "Mg8Al8",
                   "Mg8Al9",
                   "Mg9Al12",
                   "Mg9Al14",
                   "Mg9Al15",
                   "Mg9Al17",
                   "Mg9Al18",
                   "Mg9Al19",
                   "Mg9Al2",
                   "Mg9Al20",
                   "Mg9Al4",
                   "Mg9Al5",
                   "Mg9Al6",
                   "Mg9Al7",
                   "Mg9Al8",
                   "Mg9Al9",
                   "Mg10Al1",
                   "Mg10Al10",
                   "Mg10Al11",
                   "Mg10Al12",
                   "Mg10Al13",
                   "Mg10Al14",
                   "Mg10Al15",
                   "Mg10Al16",
                   "Mg10Al2",
                   "Mg10Al20",
                   "Mg10Al3",
                   "Mg10Al4",
                   "Mg10Al6",
                   "Mg10Al7",
                   "Mg10Al8",
                   "Mg10Al9",
                   "Mg11Al1",
                   "Mg11Al10",
                   "Mg11Al12",
                   "Mg11Al13",
                   "Mg11Al14",
                   "Mg11Al15",
                   "Mg11Al16",
                   "Mg11Al17",
                   "Mg11Al19",
                   "Mg11Al2",
                   "Mg11Al20",
                   "Mg11Al3",
                   "Mg11Al4",
                   "Mg11Al5",
                   "Mg11Al6",
                   "Mg11Al7",
                   "Mg11Al8",
                   "Mg11Al9",
                   "Mg12Al1",
                   "Mg12Al10",
                   "Mg12Al11",
                   "Mg12Al12",
                   "Mg12Al13",
                   "Mg12Al14",
                   "Mg12Al15",
                   "Mg12Al16",
                   "Mg12Al17",
                   "Mg12Al19",
                   "Mg12Al20",
                   "Mg12Al3",
                   "Mg12Al5",
                   "Mg12Al6",
                   "Mg12Al8",
                   "Mg13Al11",
                   "Mg13Al12",
                   "Mg13Al13",
                   "Mg13Al14",
                   "Mg13Al15",
                   "Mg13Al16",
                   "Mg13Al17",
                   "Mg13Al18",
                   "Mg13Al19",
                   "Mg13Al2",
                   "Mg13Al20",
                   "Mg13Al3",
                   "Mg13Al5",
                   "Mg13Al6",
                   "Mg13Al7",
                   "Mg13Al8",
                   "Mg13Al9",
                   "Mg14Al12",
                   "Mg14Al13",
                   "Mg14Al16",
                   "Mg14Al17",
                   "Mg14Al18",
                   "Mg14Al19",
                   "Mg14Al3",
                   "Mg14Al4",
                   "Mg14Al5",
                   "Mg14Al8",
                   "Mg14Al9",
                   "Mg15Al10",
                   "Mg15Al11",
                   "Mg15Al16",
                   "Mg15Al17",
                   "Mg15Al18",
                   "Mg15Al19",
                   "Mg15Al2",
                   "Mg15Al20",
                   "Mg15Al3",
                   "Mg15Al4",
                   "Mg15Al5",
                   "Mg15Al7",
                   "Mg15Al9",
                   "Mg16Al1",
                   "Mg16Al12",
                   "Mg16Al13",
                   "Mg16Al14",
                   "Mg16Al15",
                   "Mg16Al16",
                   "Mg16Al17",
                   "Mg16Al19",
                   "Mg16Al2",
                   "Mg16Al3",
                   "Mg16Al4",
                   "Mg16Al6",
                   "Mg16Al7",
                   "Mg17Al1",
                   "Mg17Al11",
                   "Mg17Al12",
                   "Mg17Al13",
                   "Mg17Al14",
                   "Mg17Al15",
                   "Mg17Al16",
                   "Mg17Al17",
                   "Mg17Al18",
                   "Mg17Al19",
                   "Mg17Al2",
                   "Mg17Al20",
                   "Mg17Al3",
                   "Mg17Al4",
                   "Mg17Al5",
                   "Mg17Al6",
                   "Mg17Al7",
                   "Mg17Al8",
                   "Mg17Al9",
                   "Mg18Al11",
                   "Mg18Al12",
                   "Mg18Al13",
                   "Mg18Al14",
                   "Mg18Al15",
                   "Mg18Al16",
                   "Mg18Al17",
                   "Mg18Al19",
                   "Mg18Al2",
                   "Mg18Al20",
                   "Mg18Al3",
                   "Mg18Al5",
                   "Mg18Al6",
                   "Mg18Al7",
                   "Mg18Al8",
                   "Mg18Al9",
                   "Mg19Al1",
                   "Mg19Al10",
                   "Mg19Al11",
                   "Mg19Al15",
                   "Mg19Al16",
                   "Mg19Al17",
                   "Mg19Al18",
                   "Mg19Al2",
                   "Mg19Al20",
                   "Mg19Al4",
                   "Mg19Al5",
                   "Mg19Al6",
                   "Mg19Al8",
                   "Mg19Al9",
                   "Mg20Al11",
                   "Mg20Al12",
                   "Mg20Al17",
                   "Mg20Al4",
                   "Mg20Al5",
                   "Mg20Al6",
                   "Mg20Al7",
                   "Mg20Al9"
    ],
    "_comment": " that's all ",
    "sys_configs":[""],

    "training_init_model": false,
    "numb_models": 4,
    "default_training_param": {
    "model": {
        "descriptor": {
            "type": "se_e2_a",
            "sel": [
                500,
                1500
            ],
            "rcut_smth": 2.0,
            "rcut": 6.0,
            "neuron": [
                25,
                50,
                100
            ],
            "resnet_dt": false,
            "axis_neuron": 12,
            "type_one_side": true,
            "seed": 1801819940,
            "_activation_function": "tanh"
        },
        "fitting_net": {
            "neuron": [
                240,
                240,
                240
            ],
            "resnet_dt": true,
            "seed": 2375417769
        },
        "type_map": [
            "Mg",
            "Al"
        ]
    },
    "learning_rate": {
        "type": "exp",
        "start_lr": 0.001,
        "decay_steps": 2000
    },
    "loss": {
        "start_pref_e": 0.02,
        "limit_pref_e": 1,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0,
        "limit_pref_v": 0 
    },
    "training": {
        "numb_steps": 2000,
        "seed": 3982377700,
        "disp_file": "lcurve.out",
        "disp_freq": 20,
        "numb_test": 4,
        "save_freq": 2000,
        "save_ckpt": "model.ckpt",
        "disp_training": true,
        "time_training": true,
        "profiling": false,
        "profiling_file": "timeline.json",
        "_set_prefix": "set"
    }
	},
    "model_devi_dt": 0.002,
    "model_devi_skip": 0,
    "model_devi_f_trust_lo": 0.1,
    "model_devi_f_trust_hi": 0.3,
    "model_devi_e_trust_lo": 10000000000.0,
    "model_devi_e_trust_hi": 10000000000.0,
    "model_devi_clean_traj": true,
    "model_devi_jobs": [
    ],

    "fp_style": "vasp",
    "shuffle_poscar": false,
    "fp_task_max": 10,
    "fp_task_min": 1,
    "fp_pp_path": "./vasp_input",
    "fp_pp_files": [
	    "POTCAR.Mg",
        "POTCAR.Al"
    ],
    "fp_incar":"./vasp_input/INCAR"
}
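
The model_devi_f_trust_lo and model_devi_f_trust_hi values above split explored structures into three groups by their maximum force deviation. The following is a minimal sketch of that selection idea; the function name is hypothetical, the deviation values are made up, and the exact boundary handling and the way dpgen uses ratio_failed may differ from what is shown.

```python
def classify_by_force_deviation(max_devi_f, trust_lo=0.1, trust_hi=0.3):
    """Count structures as accurate, candidate, or failed by max force deviation."""
    counts = {"accurate": 0, "candidate": 0, "failed": 0}
    for value in max_devi_f:
        if value < trust_lo:
            counts["accurate"] += 1    # models agree; no labeling needed
        elif value < trust_hi:
            counts["candidate"] += 1   # worth sending to the DFT (fp) step
        else:
            counts["failed"] += 1      # models disagree too strongly
    return counts


# Made-up deviations for four structures.
counts = classify_by_force_deviation([0.05, 0.15, 0.25, 0.5])
failed_ratio = counts["failed"] / sum(counts.values())
print(counts, "failed ratio:", failed_ratio)
print("exceeds ratio_failed (0.2):", failed_ratio > 0.2)
```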

2.3 CALYPSO input file: input.dat

input.dat is the main control file for a CALYPSO run; it must specify how structures are generated. For detailed parameter descriptions, see: https://iccms-calypso.github.io/CALYPSO-Fortran/

[38]
!cat ./calypso_input/input.dat
################################ The Basic Parameters of CALYPSO ################################
# A string of one or several words contain a descriptive name of the system (max. 40 characters).
SystemName =MgAl 
# Number of different atomic species in the simulation. 
NumberOfSpecies = 2
# Element symbols of the different chemical species.
NameOfAtoms =Mg Al
# Atomic Number of each chemical species.
AtomicNumber = 12 13 
# Number of atoms for each chemical species in one formula unit. 
NumberOfAtoms =  1 1  
# The range of formula unit per cell in your simulation. 
NumberOfFormula = 1 1  
# The volume per formula unit. Unit is in angstrom^3.
# Volume=20
# Minimal distance between atoms of each chemical species. Unit is in angstrom.
@DistanceOfIon 
1.48 1.44 
1.44 1.41 
@End
# It determines which algorithm should be adopted in the simulation.
Ialgo = 2
# Ialgo = 1 for Global PSO
# Ialgo = 2 for Local PSO (default value)
# The proportion of the structures generated by PSO.
PsoRatio = 0.0
# The population size. Normally, it has a larger number for larger systems.
PopSize = 50
#It determines which method should be adopted in generation the random structure.                              
GenType= 1 
# 1 under symmetric constraints
# 2 grid method for large system
# 3 and 4 core grow method 
# 0 combination of all method # If GenType=3 or 4, it determined the small unit to grow the whole structure
# It determines which local optimization method should be interfaced in the simulation.
ICode= 15
# ICode= 1 interfaced with VASP
# ICode= 2 interfaced with SIESTA
# ICode= 3 interfaced with GULP
# The number of lbest for local PSO
NumberOfLbest=4
# The Number of local optimization for each structure.
NumberOfLocalOptim= 3
# The command to perform local optimization calculation (e.g., VASP, SIESTA) on your computer.
Command = sh submit.sh
MaxTime = 9000 
# The Max step for iteration
MaxStep = 5
# If True, a previous calculation will be continued.
PickUp=F
# At which step will the previous calculation be picked up.
PickStep = 1
# If True, the local optimizations performed by parallel
Parallel= F
# The number node for parallel 
NumberOfParallel=4
#LMC = F
Split = T
##### The Parameters For Variational Stoichiometry  ##############
## If True, Variational Stoichiometry structure prediction is performed
VSC=T
## VSCEnergy= 9.68045866 1.00102117 
## The Max Number of Atoms in unit cell
MaxNumAtom=40
## The Variation Range for each type atom 
@CtrlRange         
1 20
1 20
@end
###################End Parameters for VSC ##########################
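
The file above mixes `key = value` lines, `#` comments, and `@Name ... @End` blocks. The sketch below reads that layout into Python for inspection. It is based only on the file shown here, not on CALYPSO's own parser, and the function name `parse_calypso_input` is hypothetical.

```python
def parse_calypso_input(text):
    """Parse input.dat-style text into (params, blocks), keeping values as strings."""
    params, blocks = {}, {}
    block_name = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        if line.startswith("@"):
            tag = line[1:].strip()
            if tag.lower() == "end":
                block_name = None            # close the current block
            else:
                block_name = tag
                blocks[block_name] = []
            continue
        if block_name is not None:
            blocks[block_name].append(line.split())
        elif "=" in line:
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
    return params, blocks


sample = """
NameOfAtoms =Mg Al
PopSize = 50
# Volume=20
@DistanceOfIon
1.48 1.44
1.44 1.41
@End
VSC=T
@CtrlRange
1 20
1 20
@end
"""
params, blocks = parse_calypso_input(sample)
print(params)
print(blocks)
```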
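
2.4 Running

Run the following command in a terminal to start:

dpgen run MgAl.json machine.json

Note: for your own research systems this step may take a long time, so you can run it in the background: nohup dpgen run MgAl.json machine.json &. This command is already written into run.sh, so you can simply run run.sh.

Each dpgen+calypso iteration needs to run CALYPSO. At present, the CALYPSO evolution step that generates structures and the model deviation calculation for the explored structures both run locally, so do not shut down the node while the example is running.
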
[39]
!bash run.sh  # takes roughly 1-2 hours to finish
[40]
ls
MgAl.json       data/      iter.000000/  out     vasp_input/
calypso_input/  dpgen.log  machine.json  run.sh
[42]
cat dpgen.log
2024-03-29 10:27:54,232 - INFO : start running
2024-03-29 10:27:54,246 - INFO : =============================iter.000000==============================
2024-03-29 10:27:54,246 - INFO : -------------------------iter.000000 task 00--------------------------
2024-03-29 10:28:11,350 - INFO : -------------------------iter.000000 task 01--------------------------
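
For a long background run it can be handy to pull the latest iteration and task out of dpgen.log. This is a minimal sketch that assumes the line format shown in the log above (`iter.NNNNNN task NN`); the function name `latest_progress` is hypothetical.

```python
import re

# Matches the "iter.000000 task 01" fragment in dpgen.log lines.
ITER_TASK = re.compile(r"iter\.(\d+) task (\d+)")


def latest_progress(log_text):
    """Return (iteration, task) from the last matching log line, or None."""
    latest = None
    for line in log_text.splitlines():
        match = ITER_TASK.search(line)
        if match:
            latest = (int(match.group(1)), int(match.group(2)))
    return latest


log = """2024-03-29 10:27:54,232 - INFO : start running
2024-03-29 10:27:54,246 - INFO : -------------------------iter.000000 task 00--------------------------
2024-03-29 10:28:11,350 - INFO : -------------------------iter.000000 task 01--------------------------
"""
print(latest_progress(log))
```

In practice the text would come from reading dpgen.log in the run directory.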
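
3. DP-accelerated CALYPSO structure prediction

After the DPGEN+CALYPSO iterations converge, we train the resulting model for longer and use the long-trained model for DP-accelerated CALYPSO structure prediction.

DP-accelerated CALYPSO structure prediction can be run in two modes:

  1. Single-node mode: the normal CALYPSO run mode. Structure generation and optimization run serially, so it may be slower than multi-node split mode; it suits cases where CPU/GPU resources are limited.
  2. Multi-node split mode: with split mode on, CALYPSO's structure generation and structure optimization are separated. With enough resources, each structure can occupy its own node, which greatly speeds up structure prediction. Because the Bohrium platform has ample resources, this mode is the recommended one there.

Both modes also work on your own cluster. The examples in this section are meant as templates for running jobs on your own cluster and are not actually run here.

To learn more about using DP-accelerated CALYPSO structure prediction conveniently on the Bohrium platform, see notebook1 and notebook2.

3.1 Single-node mode
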
[5]
cp -r /bohr/dp-calypso-example-ivbp/v1/dp-calypso-example/ /personal
[6]
cd /personal/dp-calypso-example
/personal/dp-calypso-example
[7]
ls *
nosplit_calypso_dp_example:
calypso_check_outcar.py*  graph.pb*   run.sh*
calypso_run_opt.py*       input.dat*  submit.sh*

split_calypso_dp_example:
calypso_check_outcar.py*  evo.dispatcher.py*  input.dat*     resources.json*
calypso_run_opt.py*       graph.pb*           machine.json*  run.sh*
[10]
cat nosplit_calypso_dp_example/run.sh
#!/bin/bash

calypso.x > caly.log 2>&1

[11]
cat nosplit_calypso_dp_example/submit.sh
#!/bin/bash

/opt/deepmd/deepmd-kit-222/bin/python calypso_run_opt.py || /opt/deepmd/deepmd-kit-222/bin/python calypso_check_outcar.py > log 2>&1

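
In single-node mode, the following files are needed:

  • calypso_run_opt.py: control script that uses ASE with DP to optimize structures
  • calypso_check_outcar.py: script that handles failed optimizations
  • graph.pb: the trained model
  • input.dat: the CALYPSO input parameters
  • submit.sh: the structure optimization command, typically: python calypso_run_opt.py || python calypso_check_outcar.py
  • run.sh: the job submission script. On a local machine it can be run directly; under a job scheduler, adapt it to the queue system's script format while keeping the run command unchanged: calypso.x > caly.log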
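
Before launching, the file list above can be checked against the working directory. This is a minimal sketch using only the standard library; the names `SINGLE_NODE_FILES` and `missing_files` are hypothetical, and the directory name is the one from the example dataset.

```python
from pathlib import Path

# Files this section lists as required for single-node mode.
SINGLE_NODE_FILES = [
    "calypso_run_opt.py",
    "calypso_check_outcar.py",
    "graph.pb",
    "input.dat",
    "submit.sh",
    "run.sh",
]


def missing_files(directory, required=SINGLE_NODE_FILES):
    """Return the required file names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in required if not (root / name).is_file()]


print(missing_files("nosplit_calypso_dp_example"))
```

An empty list means every listed file is in place; `calypso.x` itself is not checked here because run.sh expects it on PATH.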

Run it with: bash run.sh


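3.2 Multi-node split mode

Multi-node mode mainly uses the evo.dispatcher.py script to call dpdispatcher and submit structures in batches. Because dpdispatcher is quite flexible, many kinds of computing resources can be used.
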
[12]
ls *
nosplit_calypso_dp_example:
calypso_check_outcar.py*  graph.pb*   run.sh*
calypso_run_opt.py*       input.dat*  submit.sh*

split_calypso_dp_example:
calypso_check_outcar.py*  evo.dispatcher.py*  input.dat*     resources.json*
calypso_run_opt.py*       graph.pb*           machine.json*  run.sh*
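
In multi-node split mode, the following files are needed:

  • calypso_run_opt.py: control script that uses ASE with DP to optimize structures
  • calypso_check_outcar.py: script that handles failed optimizations
  • graph.pb: the trained model
  • input.dat: the CALYPSO input parameters
  • run.sh: the job submission script. On a local machine it can be run directly; under a job scheduler, edit machine.json and resources.json to define the machine and resources needed. The script can be submitted to a compute node (provided that compute nodes are allowed to submit further jobs to the queue) or run on the head node.
  • machine.json: describes the machine to use; see the dpdispatcher documentation for details
  • resources.json: describes the resources to use; see the dpdispatcher documentation for details
  • evo.dispatcher.py: the workflow control script; some of its parameters need to be modified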
[15]
cat split_calypso_dp_example/evo.dispatcher.py
#!/usr/bin/env python
from dpdispatcher import Machine, Resources, Task, Submission
import os, sys, shutil, glob
from pathlib import Path

# os.environ["CUDA_VISIBLE_DEVICES"]="0"
MaxStep = 50
PopSize = 200
# N_INCAR = 1

# dp
python = "/opt/deepmd/deepmd-kit-222/bin/python"

forward_files = ['calypso_run_opt.py', 'calypso_check_outcar.py', 'input.dat', 'POSCAR']
backward_files = ['OUTCAR', 'CONTCAR', 'traj.traj', 'model_devi.out']
forward_common_files=['graph.pb']

command = f'{python} calypso_run_opt.py || {python} calypso_check_outcar.py > model_devi.out 2>&1'

machine = Machine.load_from_json('./machine.json')
resources = Resources.load_from_json('./resources.json')


os.system("./calypso.x")
for step in range(1, MaxStep+1):

    task_list = []

    for pop in range(1, PopSize + 1):
        task_dir = "data/step%03d.pop%03d"% (step,pop)
        Path(task_dir).mkdir(parents=True, exist_ok=True)
        shutil.copyfile("POSCAR_%d"%pop, os.path.join(task_dir, "POSCAR"))
        shutil.copyfile("POSCAR_%d"%pop, os.path.join(task_dir, "POSCAR.ori"))
        # dp
        for file_name in forward_files[:3]:
            shutil.copyfile(file_name , os.path.join(task_dir, file_name))

        task = Task(
            command=command,
            task_work_path=task_dir,
            forward_files=forward_files,
            backward_files=backward_files,
        )
        task_list.append(task)

    submission = Submission(
            work_base = os.getcwd(),
            machine= machine,
            resources =resources,
            task_list = task_list,
            forward_common_files=forward_common_files
            )
    submission.run_submission()

    for pop in range(1, PopSize + 1):
        task_dir = "data/step%03d.pop%03d"% (step,pop)
        shutil.copyfile(os.path.join(task_dir, "CONTCAR"), "CONTCAR_%d"%pop )
        shutil.copyfile(os.path.join(task_dir, "OUTCAR"), "OUTCAR_%d"%pop)

    os.system("./calypso.x")


In evo.dispatcher.py, the MaxStep, PopSize, and python variables need to be modified:

  • MaxStep and PopSize must match the values in input.dat
  • python must point to the DeePMD-kit Python interpreter
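
Because MaxStep and PopSize are set in two places, a small consistency check can catch a mismatch before any jobs are submitted. This is a sketch under the assumption that input.dat uses the `Name = value` layout shown earlier in this tutorial; the function names are hypothetical.

```python
import re


def read_int_param(text, name):
    """Return the integer assigned to `name` in input.dat-style text, or None."""
    # Anchored at line start, so commented-out lines beginning with '#' are skipped.
    match = re.search(rf"^\s*{re.escape(name)}\s*=\s*(\d+)", text, re.MULTILINE)
    return int(match.group(1)) if match else None


def check_evo_settings(input_text, max_step, pop_size):
    """Return mismatches between evo.dispatcher.py settings and input.dat."""
    problems = []
    for name, expected in (("MaxStep", max_step), ("PopSize", pop_size)):
        found = read_int_param(input_text, name)
        if found != expected:
            problems.append(f"{name}: script has {expected}, input.dat has {found}")
    return problems


# Made-up input.dat fragment compared with the script defaults (50 and 200).
demo_input = "PopSize = 50\n# The Max step for iteration\nMaxStep = 5\n"
for problem in check_evo_settings(demo_input, max_step=50, pop_size=200):
    print(problem)
```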
[14]
cat split_calypso_dp_example/run.sh
#!/bin/bash
#SBATCH --job-name=split-mode
#SBATCH --mem-per-cpu=2gb
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=1
#SBATCH --output=%j.log
#SBATCH --partition=hebhcnormal01

# load the environment
module purge
# module load apps/PyTorch/chatglm2_py38/pytorch1.13-dtk23.04-py38

source /public/software/profile.d/compiler_intel-compiler-2017.5.239.sh 
source /public/software/profile.d/mpi_intelmpi-2018.4.274.sh 

# ./calypso.x > caly.log 2>&1
/public/share/acsp3lax5q/miniconda3/envs/workshop2023/bin/python evo.dispatcher.py > caly.log 2>&1


You can run the workflow by submitting the run.sh script, or by running evo.dispatcher.py directly from the command line.