DeePKS-Training-Demo
ou_qi@163.com
Published 2023-08-24
Recommended image: abacus:3.1.4-toolkit-notebook-oss2
Recommended machine type: c2_m4_cpu
deepks-training-demo(v1)


©️ Copyright 2023 @ Authors
Author: Qi Ou 📨
Date: 2023-08-23
Quick start: click on the Connect button located at the top of the interface, then select the abacus:3.1.4-toolkit-notebook-oss2 Image and choose c2_m4_cpu to proceed.


Goal

The goal of this notebook is to demonstrate the DeePKS iterative training procedure. A single water molecule serves as the test system, and a DeePKS model based on the LDA functional is trained to reach the accuracy of the PBE functional.

Background

The DeePKS model is trained through an iterative procedure that alternates between a self-consistent field (SCF) calculation, performed in an ab initio package, and a neural-network fit, performed in DeePKS-kit. For periodic systems, the SCF calculation with the DeePKS model is implemented in the open-source DFT package ABACUS.

[Figure: deepks_flowchart.png]
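
The alternation between SCF and training can be pictured as a simple loop. The sketch below is schematic only: `run_scf` and `train_model` are hypothetical placeholders supplied by the caller, not functions from DeePKS-kit or ABACUS, and the number of iterations is arbitrary.

```python
from typing import Any, Callable, List, Optional


def iterate_deepks(
    run_scf: Callable[[Optional[Any]], Any],
    train_model: Callable[[Any, Optional[Any]], Any],
    n_iter: int,
) -> List[Any]:
    """Alternate SCF and training; return the model produced by each iteration.

    run_scf(model) -> SCF results (model is None in the initial iteration).
    train_model(scf_results, previous_model) -> new model.
    Both callables are hypothetical stand-ins for the real SCF and training steps.
    """
    models: List[Any] = []
    model: Optional[Any] = None  # no neural-network correction before the first fit
    for _ in range(n_iter):
        scf_results = run_scf(model)             # SCF with the current model (or none)
        model = train_model(scf_results, model)  # fit a new model to the SCF output
        models.append(model)
    return models


if __name__ == "__main__":
    # Toy stand-ins that only record the order of calls.
    call_log = []

    def toy_scf(model):
        call_log.append(("scf", model))
        return "scf_out"

    def toy_train(scf_results, previous_model):
        call_log.append(("train", previous_model))
        return (previous_model or 0) + 1

    print(iterate_deepks(toy_scf, toy_train, n_iter=3))
    print(call_log)
```

The first pass through the loop runs the SCF step without any model, which mirrors the initial iteration of the procedure.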


Demo files

We now walk through a DeePKS training demo. The system is a single water molecule, and the DeePKS model is trained on top of LDA to reach the accuracy of PBE. The following cells clear any previous copy of the demo directory, copy the required files into /data, and enter the directory:

[28]
rm -rf /data/water_single_lda2pbe_abacus
[30]
cp -r /bohr/deepks-training-demo-z7vu/v1/water_single_lda2pbe_abacus /data
[31]
cd /data/water_single_lda2pbe_abacus
/data/water_single_lda2pbe_abacus

This directory contains two subdirectories, i.e. systems and iter. systems contains the atomic coordinates (atom.npy), energy labels (energy.npy), and force labels (force.npy) of each water molecule configuration:

[32]
!tree systems
systems
├── group.00
│   ├── atom.npy
│   ├── energy.npy
│   └── force.npy
├── group.01
│   ├── atom.npy
│   ├── energy.npy
│   └── force.npy
├── group.02
│   ├── atom.npy
│   ├── energy.npy
│   └── force.npy
└── group.03
    ├── atom.npy
    ├── energy.npy
    └── force.npy

4 directories, 12 files

group.00, group.01, and group.02 each contain 300 configurations, while group.03 contains 100. The first three groups form the training set, and the last one is the test set.
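
A quick way to confirm these counts is to load the arrays and look at their shapes. The sketch below assumes that the first axis of each array indexes configurations, which is not stated in this notebook; `summarize_group` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Dict, Tuple

import numpy as np


def summarize_group(group_dir: str) -> Dict[str, Tuple[int, ...]]:
    """Return the array shape of each label file found in one group directory."""
    shapes = {}
    for name in ("atom", "energy", "force"):
        path = Path(group_dir) / f"{name}.npy"
        if path.exists():
            shapes[name] = np.load(path).shape
    return shapes


if __name__ == "__main__":
    # Run from /data/water_single_lda2pbe_abacus; prints nothing if systems/ is absent.
    for group in sorted(Path("systems").glob("group.*")):
        print(group.name, summarize_group(str(group)))
```

Under the stated assumption, the leading dimension of each shape should match the configuration counts given above.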

iter is the main working directory that includes all required numerical atomic orbitals, pseudopotentials, projector files, and input parameter files for the DeePKS training:

[33]
!tree iter
iter
├── H_ONCV_PBE-1.0.upf
├── H_gga_6au_60Ry_2s1p.orb
├── O_ONCV_PBE-1.0.upf
├── O_gga_6au_60Ry_2s2p1d.orb
├── jle.orb
├── machines.yaml
├── machines_dpdispatcher.yaml
├── params.yaml
├── run.sh
├── run_dpdispatcher.sh
├── scf_abacus.yaml
└── systems.yaml

0 directories, 12 files

Training the model

We now move to iter to start the training process:

[34]
cd iter
/data/water_single_lda2pbe_abacus/iter

Next, fill in the Bohrium account information in machines_dpdispatcher.yaml:

[ ]
import getpass

import yaml

file_path = '/data/water_single_lda2pbe_abacus/iter/machines_dpdispatcher.yaml'

# Load the dispatcher configuration shipped with the demo.
with open(file_path) as f:
    config = yaml.load(f, Loader=yaml.FullLoader)

email = input('Please enter your Bohrium account: ')
password = getpass.getpass('Please enter your password: ')
program_id = int(input('Please enter your Bohrium Program ID: '))

# Both the SCF step and the training step need the same credentials.
deepks_steps = ['scf_machine', 'train_machine']
for step in deepks_steps:
    profile = config[step]['dpdispatcher_machine']['remote_profile']
    profile['email'] = email
    profile['password'] = password
    profile['program_id'] = program_id

# Write the updated configuration back; the password is stored in plain text.
with open(file_path, 'w') as f:
    yaml.dump(config, f)
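
The same update can be written as a small function that acts on the parsed dictionary, which makes it easy to check before anything is written back to disk. This is a sketch that assumes each step keeps its credentials under `dpdispatcher_machine` and then `remote_profile`, as in the cell above; `set_remote_profile` is a hypothetical helper, not part of DeePKS-kit or dpdispatcher.

```python
from typing import Any, Dict, Iterable


def set_remote_profile(
    config: Dict[str, Any],
    steps: Iterable[str],
    email: str,
    password: str,
    program_id: int,
) -> Dict[str, Any]:
    """Write Bohrium credentials into the remote_profile of each listed step (in place)."""
    for step in steps:
        profile = config[step]["dpdispatcher_machine"]["remote_profile"]
        profile["email"] = email
        profile["password"] = password
        profile["program_id"] = program_id
    return config


if __name__ == "__main__":
    # Minimal stand-in for the parsed machines_dpdispatcher.yaml (structure assumed).
    demo = {
        step: {"dpdispatcher_machine": {"remote_profile": {}}}
        for step in ("scf_machine", "train_machine")
    }
    # Placeholder credentials for illustration only.
    set_remote_profile(demo, list(demo), "user@example.com", "not-a-real-password", 12345)
    print(demo["scf_machine"]["dpdispatcher_machine"]["remote_profile"]["program_id"])
```

Keeping the dictionary update separate from file I/O means a missing key raises a `KeyError` before the YAML file is touched.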

To trigger the training process, simply issue:

[38]
!bash run_dpdispatcher.sh

Error messages, if any, are collected in err.iter:

[40]
cat err.iter

Analyzing the result

The training procedure runs iteratively. Results of the first iteration are collected in iter.init, which corresponds to the so-called DeePHF training. For each iteration, the SCF results are stored in 00.scf under iter.xx, and the training results are stored in 01.train under iter.xx. To check the DeePHF training results:

[16]
!tree iter.init/01.train
iter.init/01.train
├── 03a64652bceef7e5b17d2f837ab01b499a284c71.sub
├── 03a64652bceef7e5b17d2f837ab01b499a284c71_flag_if_job_task_fail
├── 03a64652bceef7e5b17d2f837ab01b499a284c71_job_tag_finished
├── STDOUTERR
├── backup
│   ├── 03a64652bceef7e5b17d2f837ab01b499a284c71.zip
│   └── 03a64652bceef7e5b17d2f837ab01b499a284c71_back.zip
├── d87afba9fc01b802763ef2b580fdcd7d37b59093_task_tag_finished
├── data_test -> ../00.scf/data_test
├── data_train -> ../00.scf/data_train
├── err
├── err.train
├── lbg-415-8485888.sh
├── log.test
├── log.train
├── model.pth
├── test.out
└── train_input.yaml -> ../../share/init_train.yaml

3 directories, 15 files
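
To see which iterations already have SCF and training stages, one can scan the working directory for this layout. The sketch assumes that iteration directories are named `iter.init`, `iter.00`, `iter.01`, and so on, and that a stage counts as present when its subdirectory exists; `list_iterations` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List


def list_iterations(workdir: str = ".") -> List[Dict[str, object]]:
    """List iter.* directories and whether 00.scf and 01.train exist in each."""
    candidates = sorted(
        Path(workdir).glob("iter.*"),
        key=lambda p: (p.name != "iter.init", p.name),  # iter.init first, then by name
    )
    rows = []
    for it in candidates:
        if not it.is_dir():
            continue  # ignore plain files that happen to match the pattern
        rows.append(
            {
                "name": it.name,
                "scf": (it / "00.scf").is_dir(),
                "train": (it / "01.train").is_dir(),
            }
        )
    return rows


if __name__ == "__main__":
    # Run from the iter working directory.
    for row in list_iterations("."):
        print(row)
```

The presence of a subdirectory does not by itself show that the stage finished successfully, so the error and log files are still worth checking.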

The learning curve and training errors are stored in log.train and log.test, respectively.
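
For plotting or quick inspection, such a log can be read into numeric rows. The sketch below assumes that the log is a whitespace-separated table whose comment and header lines start with `#`; this layout is an assumption about the file format rather than something shown in this notebook, and `read_numeric_log` is a hypothetical helper.

```python
from typing import List


def read_numeric_log(text: str) -> List[List[float]]:
    """Parse whitespace-separated numeric rows, skipping blank lines and '#' lines."""
    rows = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        try:
            rows.append([float(tok) for tok in stripped.split()])
        except ValueError:
            # Skip any line that is not purely numeric instead of guessing its meaning.
            continue
    return rows


if __name__ == "__main__":
    # Made-up sample in the assumed layout, not real training output.
    sample = """# col_a col_b col_c
    0 1.0e-02 1.2e-02
    100 5.0e-03 6.0e-03
    """
    for row in read_numeric_log(sample):
        print(row)
```

To use it on a real file, pass the file contents, for example `read_numeric_log(open("iter.init/01.train/log.train").read())`, and check the header line of the file to learn what each column means.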
