Bohrium
robot
新建

空间站广场

论文
Notebooks
比赛
课程
Apps
我的主页
我的Notebooks
我的论文库
我的足迹

我的工作空间

任务
节点
文件
数据集
镜像
项目
数据库
公开
硫族化合物固态电解质通用力场横向测试
notebook
notebook
SchrodingersCat
更新于 2024-06-18
推荐镜像 :chgnet-m3gnet:0.0.3
推荐机型 :c2_m4_cpu
赞 2
1
LAM-SSE(v4)

硫族化合物固态电解质通用力场横向测试

代码
文本

1. 通用力场整体介绍 Overview of Universal Force Fields

代码
文本

在探索固态电解质的性能和稳定性时,准确的原子间势能模型(IAP)是至关重要的。硫族化合物作为固态电解质的一种重要类型,因其在能源存储和转换设备中的应用潜力而受到广泛关注。为了有效地模拟这些材料的性质和行为,需要一个能够广泛适用于硫族化合物的通用力场。

  • CHGNet (Crystal Hamiltonian Graph Neural Network)

CHGNet是一种基于图神经网络(GNN)的机器学习原子间势能模型(MLIP),由加州大学伯克利分校的研究团队开发。它专为解决原子建模中的电子相互作用问题而设计,特别是在处理含有异价离子的固态材料时。CHGNet在训练过程中明确包含了磁矩信息,这使得模型能够学习和准确地表示电子的轨道占据情况,从而增强了描述原子和电子自由度的能力。CHGNet通过结合磁矩作为电荷状态的约束,提供了对材料建模中额外电子自由度的洞察,这些自由度在以往的MLIPs中是无法观察到的。

  • M3GNet (Materials Graph Network with Many-Body Interactions):

M3GNet是由Materials Virtual Lab开发的基于图神经网络的通用原子间势能模型,旨在为周期表中的所有元素提供一种通用的IAP。M3GNet通过结合传统IAPs中的多体特征和灵活的图材料表示,使用Materials Project数据库中的大量结构弛豫数据进行训练。该模型能够准确预测能量、力和应力,并且在晶体结构的声子和弹性计算、结构弛豫等方面具有广泛的应用。M3GNet的一个关键特点是它的可扩展性,可以轻松地应用于多组分化学体系,而不需要重新训练。此外,M3GNet还能够作为其他结构探索技术(如进化算法和生成模型)的替代模型,用于生成更多样化和不受限制的候选结构。

尽管CHGNet和M3GNet都是在原子尺度模拟领域的重要进展,为材料科学研究提供了强大的工具,但它们各自也有一些局限性。例如,CHGNet在处理氧化态的表示时,依赖于预先定义的氧化态,这可能限制了它在处理复杂材料系统中的适用性。而M3GNet虽然在预测能量、力和应力方面表现出色,但其训练数据主要集中在优化轨迹和平衡结构上,可能无法充分捕捉到高温下材料动态行为的复杂性。

相比之下,DPA-2模型通过采用先进的深度学习技术,能够更全面地捕捉材料系统的多体相互作用和动态性质。DPA-2的设计理念是在保持计算效率的同时,提供与第一性原理计算相当的精度,这使得它在模拟长时间尺度和大规模系统的动态行为时具有显著优势。此外,DPA-2模型的泛化能力较强,能够适应更广泛的温度和压力条件,为研究者提供了一个更为灵活和全面的材料模拟平台。

总的来说,CHGNet和M3GNet在特定领域内展现了强大的性能,但在某些方面仍有提升空间。而DPA-2模型则在这些方面提供了改进的潜力,为未来的材料模拟研究开辟了新的可能性。随着模型的进一步发展和优化,我们期待DPA-2能够在材料科学和计算化学领域发挥更大的作用。


Accurate interatomic potential models (IAPs) are critical when exploring the performance and stability of solid-state electrolytes. Chalcogenides, as an important type of solid-state electrolytes, have attracted widespread attention due to their potential applications in energy storage and conversion devices. In order to effectively simulate the properties and behaviors of these materials, a general force field that can be widely applicable to chalcogenides is needed.

  • CHGNet

CHGNet is a machine learning interatomic potential model (MLIP) based on graph neural networks (GNNs), developed by a research team at the University of California, Berkeley. It is designed to address the problem of electron interactions in atomistic modeling, especially when dealing with solid-state materials containing heterovalent ions. CHGNet explicitly includes magnetic moment information during training, which enables the model to learn and accurately represent the orbital occupancy of electrons, thereby enhancing the ability to describe atomic and electronic degrees of freedom. By incorporating magnetic moments as constraints on charge states, CHGNet provides insights into additional electronic degrees of freedom in material modeling that are not observable in previous MLIPs.

  • M3GNet

M3GNet is a universal interatomic potential model based on graph neural networks developed by Materials Virtual Lab, aiming to provide a universal IAP for all elements in the periodic table. M3GNet combines the multi-body features in traditional IAPs with flexible graph material representations, and is trained using a large amount of structural relaxation data from the Materials Project database. The model can accurately predict energies, forces, and stresses, and has a wide range of applications in phonon and elastic calculations of crystal structures, structural relaxation, etc. A key feature of M3GNet is its scalability, which can be easily applied to multi-component chemical systems without retraining. In addition, M3GNet can also serve as an alternative model to other structural exploration techniques (such as evolutionary algorithms and generative models) for generating more diverse and unrestricted candidate structures.

Although both CHGNet and M3GNet are important advances in the field of atomic-scale simulations and provide powerful tools for materials science research, they each have some limitations. For example, CHGNet relies on pre-defined oxidation states when dealing with the representation of oxidation states, which may limit its applicability in dealing with complex material systems. While M3GNet performs well in predicting energy, force, and stress, its training data is mainly focused on optimizing trajectories and equilibrium structures, which may not fully capture the complexity of the dynamic behavior of materials at high temperatures.

In contrast, the DPA-2 model can more comprehensively capture the multi-body interactions and dynamic properties of material systems by adopting advanced deep learning techniques. The design concept of DPA-2 is to provide accuracy comparable to first-principles calculations while maintaining computational efficiency, which gives it a significant advantage in simulating the dynamic behavior of long time scales and large-scale systems. In addition, the DPA-2 model has strong generalization ability and can adapt to a wider range of temperature and pressure conditions, providing researchers with a more flexible and comprehensive material simulation platform.

In general, CHGNet and M3GNet have shown strong performance in specific areas, but there is still room for improvement in some aspects. The DPA-2 model provides the potential for improvement in these aspects, opening up new possibilities for future material simulation research. With further development and optimization of the model, we expect DPA-2 to play a greater role in materials science and computational chemistry.

  • 参考文献 References
  1. Chen, C., Ong, S.P. A universal graph deep learning interatomic potential for the periodic table. Nat Comput Sci 2, 718–728 (2022). https://doi.org/10.1038/s43588-022-00349-3
  2. Deng, B., Zhong, P., Jun, K. et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nat Mach Intell 5, 1031–1041 (2023). https://doi.org/10.1038/s42256-023-00716-3
代码
文本

2. 通用力场精度评测 Precision Evaluation of Universal Force Fields

代码
文本

2.1 平衡态数据集 Near-Equilibrium Datasets

由于通用力场往往只有平衡态附近的结构作为初始训练集,为了保证测试的公开公正,增加了平衡态构型作为测试集进行比较。 直接采用MP的结构,采用PBE/PBEsol分别run 10 步AIMD (50 K) 得到6个体系(LXPS,LPSX)的测试集,分别采用DPA-2,CHGNET和M3GNET测试 DPMD相比于通用力场,能量精度高1到2个数量级,力精度高2到5倍

下表中平衡位置附近结构中,DPA-2与CHGNET和M3GNET的对比

CHGNET LGPS LPSBr LPSCl LPSI LSiPS LSnPS
Energy MAE (meV/atom) 240.7 220.2 233.1 201.1 232.2 244.8
Force MAE (meV/Å) 27.9 31.9 29.8 39.4 26.1 35.8
Energy RMSE (meV/atom) 240.7 220.2 233.1 201.1 232.2 244.8
Force RMSE (meV/Å) 28.1 55.2 51.6 68.2 26.4 39.0
M3GNET LGPS LPSBr LPSCl LPSI LSiPS LSnPS
Energy MAE (meV/atom) 4.9 2.5 14.5 0.6 7.9 3.5
Force MAE (meV/Å) 18.6 18.7 19.8 22.9 34.7 37
Energy RMSE (meV/atom) 4.9 2.5 14.5 0.6 7.9 3.5
Force RMSE (meV/Å) 23.9 32.3 34.3 39.7 39.3 38.2
DPA-2 LGPS LPSBr LPSCl LPSI LSiPS LSnPS
Energy MAE (meV/atom) 1.0 0.6 0.1 1.1 1.0 1.5
Force MAE (meV/Å) 5.8 4.9 4.1 5.3 6.9 9.2
Energy RMSE (meV/atom) 1.0 0.6 0.1 1.1 1.0 1.5
Force RMSE (meV/Å) 8.3 6.5 5.5 8.6 10.0 14.2

为了notebook的简洁,下面我们用Li-Ge-P-S体系进行展示通用力场的计算。

M3GNET和CHGNET的特点都是内部嵌入了ase的部分模块。而我们的数据集是deepmd/npy格式的,因此需要用dpdata读出并记录DFT的标签,然后转成ase的格式并且调用M3GNET/CHGNET进行能量、受力的预测


For the simplicity of the notebook, we will use the Li-Ge-P-S system to demonstrate the calculation of the general force field.

The characteristics of M3GNET and CHGNET are that some modules of ase are embedded internally. Our data set is in deepmd/npy format, so we need to use dpdata to read and record the DFT label, then convert it to ase format and call M3GNET/CHGNET to predict energy and force.

代码
文本
[4]
import os
os.chdir("/bohr/LAM-SSE-7xfp/v4/20240412-data-SSE/20240412-SSE-CHGNET-M3GNET/00.e-f/equi")
os.system("tree -L 1")

.
├── LGePS
├── chgnet_test.json
└── m3gnet_test.json

1 directory, 2 files
0
代码
文本

M3GNET

代码
文本
[7]
import dpdata
from ase import Atoms
from m3gnet.models import M3GNetCalculator, Potential, M3GNet
from ase import Atoms
#from ase.lattice.spacegroup import crystal
import glob
import json
import gc


# Initialize an M3GNet object via the following code
m3gnet_model = M3GNet.load()

# Initialize the Potential object with the created M3GNet model
potential = Potential(model=m3gnet_model)

# Initialize the M3GNetCalculator with the defined Potential
calculator = M3GNetCalculator(potential=potential)


def test_m3gnet(formula, atomic_positions, cell):
struct_data = Atoms(formula, positions=atomic_positions, cell=cell, pbc=True)

# Assign the calculator to the Atoms object
struct_data.set_calculator(calculator)

# Calculate properties
energy_result = struct_data.get_potential_energy()
forces_result = struct_data.get_forces()
stress_result = struct_data.get_stress()
print(energy_result, '\n', forces_result, '\n', stress_result)
return energy_result, forces_result, stress_result

import numpy as np

def convert_ndarray(obj):
if isinstance(obj, np.generic):
return obj.item()
elif isinstance(obj, np.ndarray):
return obj.tolist()
elif isinstance(obj, dict):
return {key: convert_ndarray(val) for key, val in obj.items()}
elif isinstance(obj, list):
return [convert_ndarray(item) for item in obj]
else:
return obj

def main():
results_dict = {}
datasets = glob.glob('L*')
for dataset in datasets:
results_dict[dataset] = []
dp = dpdata.LabeledSystem(dataset, fmt='deepmd/npy')
ase = dp.to_ase_structure()
frame_num = len(ase)
print(frame_num)
for j in range(frame_num):
atom_num = len(ase[j])
formula = ase[j].symbols
atom_coords = []
for i in range(atom_num):
atom_pos = ase[j][i].position
atom_coords.append(atom_pos)
cell = ase[j].cell
atomic_positions = atom_coords
energy_result, forces_result, stress_result = test_m3gnet(formula, atomic_positions, cell)
print(energy_result, forces_result, stress_result)
# Append results for each frame within the dataset
results_dict[dataset].append({
'frame': j,
'energy_result': energy_result,
'forces_result': forces_result if isinstance(forces_result, list) else forces_result.tolist(),
'stress_result': stress_result if isinstance(stress_result, list) else stress_result.tolist()
})

# Convert all ndarrays to lists in the results_dict
results_dict = convert_ndarray(results_dict)

# Write the dictionary to a JSON file
'''
with open('m3gnet_test.json', 'w') as f:
json.dump(results_dict, f, indent=4)'''
main()
已隐藏输出
代码
文本
[8]
import dpdata
import json
import glob
import numpy as np
import matplotlib.pyplot as plt
from itertools import chain
import pandas as pd

datasets = glob.glob('L*')
m3gnet_test_res = './m3gnet_test.json'
with open(m3gnet_test_res, 'r', encoding='utf-8') as file:
json_data = json.load(file)
energy_label_list = []
energy_m3gnet_list = []
forces_label_list = []
forces_m3gnet_list = []
energy_label_bias_list = []
energy_m3gnet_bias_list = []

results = []
for idx, dataset in enumerate(datasets):
num_frame = len(json_data[f"{dataset}"])
k = dpdata.LabeledSystem(dataset, fmt='deepmd/npy')
energy_label_list_sys = []
energy_m3gnet_list_sys = []
forces_label_list_sys = []
forces_m3gnet_list_sys = []
natoms = []
for frame_idx in range(num_frame):
atom_numbs = sum(k['atom_numbs'])
energy_m3gnet = json_data[f"{dataset}"][frame_idx]["energy_result"]/atom_numbs #eV/atom
forces_m3gnet = json_data[f"{dataset}"][frame_idx]["forces_result"]
stress_m3gnet = json_data[f"{dataset}"][frame_idx]["stress_result"]
energy_label = (k['energies'][frame_idx])/atom_numbs #eV/atom
forces_label = k['forces'][frame_idx]
stress_label = k['virials'][frame_idx]

energy_label_list_sys.append(energy_label)
energy_m3gnet_list_sys.append(energy_m3gnet)
forces_label_list_sys.append(forces_label)
forces_m3gnet_list_sys.append(forces_m3gnet)
natoms.append(atom_numbs)


energy_label_list.append(energy_label_list_sys)
energy_m3gnet_list.append(energy_m3gnet_list_sys)

# Calculating energy bias
natoms_array = np.array(natoms).reshape(-1, 1)
energy_label_array = np.array(energy_label_list_sys).reshape(-1, 1)
energy_m3gnet_array = np.array(energy_m3gnet_list_sys).reshape(-1, 1)
energy_label_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_label_array, rcond=None)
energy_m3gnet_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_m3gnet_array, rcond=None)
energy_label_bias_list.append(energy_label_bias)
energy_m3gnet_bias_list.append(energy_m3gnet_bias)


forces_label_list_sys_flattened = [element for sublist in forces_label_list_sys for element in sublist]
forces_m3gnet_list_sys_flattened = [element for sublist in forces_m3gnet_list_sys for element in sublist]
forces_label_list.append(forces_label_list_sys_flattened)
forces_m3gnet_list.append(forces_m3gnet_list_sys_flattened)

energy_array_label = np.array(energy_label_list[idx]) - energy_label_bias_list[idx] # considering energy bias
energy_array_m3gnet = np.array(energy_m3gnet_list[idx]) - energy_m3gnet_bias_list[idx] # considering energy bias
forces_array_label = np.array(forces_label_list_sys_flattened[idx])
forces_array_m3gnet = np.array(forces_m3gnet_list_sys_flattened[idx])


# Calculate RMSEs
rmse_e = np.sqrt(np.mean((energy_array_label - energy_array_m3gnet) ** 2))
rmse_f = np.sqrt(np.mean((forces_array_label - forces_array_m3gnet) ** 2))
#print(f"{dataset} (rmse_e/rmse_f):\t{rmse_e:.4f}\t{rmse_f:.4f}\n")
# Calculate MAEs
mae_e = np.mean(np.abs(energy_array_label - energy_array_m3gnet))
mae_f = np.mean(np.abs(forces_array_label - forces_array_m3gnet))
#print(f"{dataset} (mae_e/mae_f):\t{mae_e:.4f}\t{mae_f:.4f}\n")

results.append({
"Dataset": dataset,
"MAE (Energy)[meV/atom]": f"{1000*mae_e:.1f}",
"MAE (Force)[meV/ang.]": f"{1000*mae_f:.1f}",
"RMSE (Energy)[meV/atom]": f"{1000*rmse_e:.1f}",
"RMSE (Force)[meV/ang.]": f"{1000*rmse_f:.1f}"
})

#---------------------

# Create a pandas DataFrame
df_results = pd.DataFrame(results)

# Convert DataFrame to markdown-formatted string
markdown_table = df_results.to_markdown(index=False)

# To print or write the markdown table to a file, depending on your needs
print(markdown_table)

| Dataset   |   MAE (Energy)[meV/atom] |   MAE (Force)[meV/ang.] |   RMSE (Energy)[meV/atom] |   RMSE (Force)[meV/ang.] |
|:----------|-------------------------:|------------------------:|--------------------------:|-------------------------:|
| LGePS     |                      4.9 |                    18.6 |                       4.9 |                     23.9 |
代码
文本

CHGNET

代码
文本
[10]


from chgnet.model.model import CHGNet
from pymatgen.core import Structure
from pymatgen.io.ase import AseAtomsAdaptor
import dpdata
from ase import Atoms
#from ase.lattice.spacegroup import crystal
import glob
import json
import numpy as np

def test_chgnet(formula, atomic_positions, cell):
chgnet = CHGNet.load()
structure = Atoms(formula, positions=atomic_positions, cell=cell, pbc=True)
pymatgen_structure = AseAtomsAdaptor.get_structure(structure)
prediction = chgnet.predict_structure(pymatgen_structure)
predicted_energy = prediction['e']
predicted_forces = prediction['f']
predicted_stress = prediction['s']

print(predicted_energy, predicted_forces, predicted_stress)
return predicted_energy, predicted_forces, predicted_stress


import numpy as np

def convert_ndarray(obj):
if isinstance(obj, np.ndarray):
return obj.tolist()
elif isinstance(obj, dict):
return {key: convert_ndarray(val) for key, val in obj.items()}
elif isinstance(obj, list):
return [convert_ndarray(item) for item in obj]
else:
return obj

def main():
results_dict = {}
datasets = glob.glob('L*')
for dataset in datasets:
results_dict[dataset] = []
dp = dpdata.LabeledSystem(dataset, fmt='deepmd/npy')
ase = dp.to_ase_structure()
frame_num = len(ase)
print(frame_num)
for j in range(frame_num):
atom_num = len(ase[j])
formula = ase[j].symbols
atom_coords = []
for i in range(atom_num):
atom_pos = ase[j][i].position
atom_coords.append(atom_pos)
cell = ase[j].cell
atomic_positions = atom_coords
energy_result, forces_result, stress_result = test_chgnet(formula, atomic_positions, cell)
print(energy_result, forces_result, stress_result)
# Append results for each frame within the dataset
results_dict[dataset].append({
'frame': j,
'energy_result': energy_result,
'forces_result': forces_result if isinstance(forces_result, list) else forces_result.tolist(),
'stress_result': stress_result if isinstance(stress_result, list) else stress_result.tolist()
})

# Convert all ndarrays to lists in the results_dict
results_dict = convert_ndarray(results_dict)
'''
# Write the dictionary to a JSON file
with open('chgnet_test.json', 'w') as f:
json.dump(results_dict, f, indent=4)'''

main()

已隐藏输出
代码
文本
[11]
import dpdata
import json
import glob
import numpy as np
import matplotlib.pyplot as plt
from itertools import chain
import pandas as pd

datasets = glob.glob(f'L*')
chgnet_test_res = './chgnet_test.json'
with open(chgnet_test_res, 'r', encoding='utf-8') as file:
json_data = json.load(file)
energy_label_list = []
energy_chgnet_list = []
forces_label_list = []
forces_chgnet_list = []
energy_label_bias_list = []
energy_chgnet_bias_list = []

results = []

for idx, dataset in enumerate(datasets):
num_frame = len(json_data[f"{dataset}"])
k = dpdata.LabeledSystem(dataset, fmt='deepmd/npy')
energy_label_list_sys = []
energy_chgnet_list_sys = []
forces_label_list_sys = []
forces_chgnet_list_sys = []
natoms = []
for frame_idx in range(num_frame):
energy_chgnet = json_data[f"{dataset}"][frame_idx]["energy_result"] #eV/atom
forces_chgnet = json_data[f"{dataset}"][frame_idx]["forces_result"]
stress_chgnet = json_data[f"{dataset}"][frame_idx]["stress_result"]

atom_numbs = sum(k['atom_numbs'])
energy_label = (k['energies'][frame_idx])/atom_numbs #eV/atom
forces_label = k['forces'][frame_idx]
stress_label = k['virials'][frame_idx]

energy_label_list_sys.append(energy_label)
energy_chgnet_list_sys.append(energy_chgnet)
forces_label_list_sys.append(forces_label)
forces_chgnet_list_sys.append(forces_chgnet)
natoms.append(atom_numbs)


energy_label_list.append(energy_label_list_sys)
energy_chgnet_list.append(energy_chgnet_list_sys)

# Calculating energy bias
natoms_array = np.array(natoms).reshape(-1, 1)
energy_label_array = np.array(energy_label_list_sys).reshape(-1, 1)
energy_chgnet_array = np.array(energy_chgnet_list_sys).reshape(-1, 1)
energy_label_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_label_array, rcond=None)
energy_chgnet_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_chgnet_array, rcond=None)
energy_label_bias_list.append(energy_label_bias)
energy_chgnet_bias_list.append(energy_chgnet_bias)

forces_label_list_sys_flattened = [element for sublist in forces_label_list_sys for element in sublist]
forces_chgnet_list_sys_flattened = [element for sublist in forces_chgnet_list_sys for element in sublist]
forces_label_list.append(forces_label_list_sys_flattened)
forces_chgnet_list.append(forces_chgnet_list_sys_flattened)

energy_array_label = np.array(energy_label_list[idx]) - energy_label_bias_list[idx] # considering energy bias
energy_array_chgnet = np.array(energy_chgnet_list[idx]) - energy_chgnet_bias_list[idx] # considering energy bias
forces_array_label = np.array(forces_label_list_sys_flattened[idx])
forces_array_chgnet = np.array(forces_chgnet_list_sys_flattened[idx])

# Calculate RMSEs
rmse_e = np.sqrt(np.mean((energy_array_label - energy_array_chgnet) ** 2))
rmse_f = np.sqrt(np.mean((forces_array_label - forces_array_chgnet) ** 2))
#print(f"{dataset} (rmse_e/rmse_f):\t{1000*rmse_e:.4f}meV/atom\t{1000*rmse_f:.4f}meV/ang.\n")

# Calculate MAEs
mae_e = np.mean(np.abs(energy_array_label - energy_array_chgnet))
mae_f = np.mean(np.abs(forces_array_label - forces_array_chgnet))
#print(f"{dataset} (mae_e/mae_f):\t{1000*mae_e:.4f}meV/atom\t{1000*mae_f:.4f}meV/ang.\n")

results.append({
"Dataset": dataset,
"MAE (Energy)[meV/atom]": f"{1000*mae_e:.1f}",
"MAE (Force)[meV/ang.]": f"{1000*mae_f:.1f}",
"RMSE (Energy)[meV/atom]": f"{1000*rmse_e:.1f}",
"RMSE (Force)[meV/ang.]": f"{1000*rmse_f:.1f}"
})

#---------------------


# Create a pandas DataFrame
df_results = pd.DataFrame(results)

# Convert DataFrame to markdown-formatted string
markdown_table = df_results.to_markdown(index=False)

# To print or write the markdown table to a file, depending on your needs
print(markdown_table)

| Dataset   |   MAE (Energy)[meV/atom] |   MAE (Force)[meV/ang.] |   RMSE (Energy)[meV/atom] |   RMSE (Force)[meV/ang.] |
|:----------|-------------------------:|------------------------:|--------------------------:|-------------------------:|
| LGePS     |                    240.7 |                    27.9 |                     240.7 |                     28.1 |
代码
文本

2.2 DPMD升温轨迹数据集 Heating Trajectories form DPMD

在DPMD的升温轨迹(150~1150 K)中挑选100帧结构,进行单点计算(PBE-sol/PBE),得到6个体系(LXPS,LPSX)的测试集,进行测试。


100 frames were selected from the DPMD heating trajectory (150-1150 K) for single-point calculations (PBE-sol for DP and PBE for CHGNET/M3GNET) to obtain a test set of 6 systems (LXPS, LPSX) for testing.

CHGNET LGPS LPSBr LPSCl LPSI LSiPS LSnPS
Energy MAE (meV/atom) 288.2 261.9 270.9 242.2 277.5 288.7
Force MAE (meV/Å) 118.6 105.3 23.8 119.3 44.6 53.3
Energy RMSE (meV/atom) 288.9 262.6 271.5 243.1 278.1 289.3
Force RMSE (meV/Å) 133.1 105.6 31.1 141 52.5 59.5
M3GNET LGPS LPSBr LPSCl LPSI LSiPS LSnPS
Energy MAE (meV/atom) 48.6 34.7 33.9 30.4 40.4 52.7
Force MAE (meV/Å) 83.1 109.4 44.5 115.7 38.4 65.9
Energy RMSE (meV/atom) 53.5 41.3 39.5 37.3 45.6 57.2
Force RMSE (meV/Å) 95.6 109.5 45.4 135.2 53.8 85.7
DPA-2 LGPS LPSBr LPSCl LPSI LSiPS LSnPS
Energy MAE (meV/atom) 1.6 0.9 1.6 0.7 1.4 2.7
Force MAE (meV/Å) 27.8 27.9 30.3 24.2 27.1 32.2
Energy RMSE (meV/atom) 1.9 1.2 1.8 0.8 1.6 3
Force RMSE (meV/Å) 38 37.6 41.1 32.9 36.9 43.9
代码
文本
[12]
import os
os.chdir("/bohr/LAM-SSE-7xfp/v4/20240412-data-SSE/20240412-SSE-CHGNET-M3GNET/00.e-f/heating")
os.system("tree -L 1")
.
├── LGPS
├── chgnet_test.json
└── m3gnet_test.json

1 directory, 2 files
0
代码
文本

M3GNET

代码
文本
[13]
import dpdata
import json
import glob
import numpy as np
import matplotlib.pyplot as plt
from itertools import chain
import pandas as pd

datasets = glob.glob('L*')
m3gnet_test_res = './m3gnet_test.json'
with open(m3gnet_test_res, 'r', encoding='utf-8') as file:
json_data = json.load(file)
energy_label_list = []
energy_m3gnet_list = []
forces_label_list = []
forces_m3gnet_list = []
energy_label_bias_list = []
energy_m3gnet_bias_list = []

results = []
for idx, dataset in enumerate(datasets):
num_frame = len(json_data[f"{dataset}"])
k = dpdata.LabeledSystem(dataset, fmt='deepmd/npy')
energy_label_list_sys = []
energy_m3gnet_list_sys = []
forces_label_list_sys = []
forces_m3gnet_list_sys = []
natoms = []
for frame_idx in range(num_frame):
atom_numbs = sum(k['atom_numbs'])
energy_m3gnet = json_data[f"{dataset}"][frame_idx]["energy_result"]/atom_numbs #eV/atom
forces_m3gnet = json_data[f"{dataset}"][frame_idx]["forces_result"]
stress_m3gnet = json_data[f"{dataset}"][frame_idx]["stress_result"]
energy_label = (k['energies'][frame_idx])/atom_numbs #eV/atom
forces_label = k['forces'][frame_idx]
stress_label = k['virials'][frame_idx]

energy_label_list_sys.append(energy_label)
energy_m3gnet_list_sys.append(energy_m3gnet)
forces_label_list_sys.append(forces_label)
forces_m3gnet_list_sys.append(forces_m3gnet)
natoms.append(atom_numbs)


energy_label_list.append(energy_label_list_sys)
energy_m3gnet_list.append(energy_m3gnet_list_sys)

# Calculating energy bias
natoms_array = np.array(natoms).reshape(-1, 1)
energy_label_array = np.array(energy_label_list_sys).reshape(-1, 1)
energy_m3gnet_array = np.array(energy_m3gnet_list_sys).reshape(-1, 1)
energy_label_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_label_array, rcond=None)
energy_m3gnet_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_m3gnet_array, rcond=None)
energy_label_bias_list.append(energy_label_bias)
energy_m3gnet_bias_list.append(energy_m3gnet_bias)


forces_label_list_sys_flattened = [element for sublist in forces_label_list_sys for element in sublist]
forces_m3gnet_list_sys_flattened = [element for sublist in forces_m3gnet_list_sys for element in sublist]
forces_label_list.append(forces_label_list_sys_flattened)
forces_m3gnet_list.append(forces_m3gnet_list_sys_flattened)

energy_array_label = np.array(energy_label_list[idx]) - energy_label_bias_list[idx] # considering energy bias
energy_array_m3gnet = np.array(energy_m3gnet_list[idx]) - energy_m3gnet_bias_list[idx] # considering energy bias
forces_array_label = np.array(forces_label_list_sys_flattened[idx])
forces_array_m3gnet = np.array(forces_m3gnet_list_sys_flattened[idx])


# Calculate RMSEs
rmse_e = np.sqrt(np.mean((energy_array_label - energy_array_m3gnet) ** 2))
rmse_f = np.sqrt(np.mean((forces_array_label - forces_array_m3gnet) ** 2))
#print(f"{dataset} (rmse_e/rmse_f):\t{rmse_e:.4f}\t{rmse_f:.4f}\n")
# Calculate MAEs
mae_e = np.mean(np.abs(energy_array_label - energy_array_m3gnet))
mae_f = np.mean(np.abs(forces_array_label - forces_array_m3gnet))
#print(f"{dataset} (mae_e/mae_f):\t{mae_e:.4f}\t{mae_f:.4f}\n")

results.append({
"Dataset": dataset,
"MAE (Energy)[meV/atom]": f"{1000*mae_e:.1f}",
"MAE (Force)[meV/ang.]": f"{1000*mae_f:.1f}",
"RMSE (Energy)[meV/atom]": f"{1000*rmse_e:.1f}",
"RMSE (Force)[meV/ang.]": f"{1000*rmse_f:.1f}"
})

#---------------------

# Create a pandas DataFrame
df_results = pd.DataFrame(results)

# Convert DataFrame to markdown-formatted string
markdown_table = df_results.to_markdown(index=False)

# To print or write the markdown table to a file, depending on your needs
print(markdown_table)

| Dataset   |   MAE (Energy)[meV/atom] |   MAE (Force)[meV/ang.] |   RMSE (Energy)[meV/atom] |   RMSE (Force)[meV/ang.] |
|:----------|-------------------------:|------------------------:|--------------------------:|-------------------------:|
| LGPS      |                     48.6 |                    83.1 |                      53.5 |                     95.6 |
代码
文本

CHGNET

代码
文本
[14]
import dpdata
import json
import glob
import numpy as np
import matplotlib.pyplot as plt
from itertools import chain
import pandas as pd

datasets = glob.glob(f'L*')
chgnet_test_res = './chgnet_test.json'
with open(chgnet_test_res, 'r', encoding='utf-8') as file:
json_data = json.load(file)
energy_label_list = []
energy_chgnet_list = []
forces_label_list = []
forces_chgnet_list = []
energy_label_bias_list = []
energy_chgnet_bias_list = []

results = []

for idx, dataset in enumerate(datasets):
num_frame = len(json_data[f"{dataset}"])
k = dpdata.LabeledSystem(dataset, fmt='deepmd/npy')
energy_label_list_sys = []
energy_chgnet_list_sys = []
forces_label_list_sys = []
forces_chgnet_list_sys = []
natoms = []
for frame_idx in range(num_frame):
energy_chgnet = json_data[f"{dataset}"][frame_idx]["energy_result"] #eV/atom
forces_chgnet = json_data[f"{dataset}"][frame_idx]["forces_result"]
stress_chgnet = json_data[f"{dataset}"][frame_idx]["stress_result"]

atom_numbs = sum(k['atom_numbs'])
energy_label = (k['energies'][frame_idx])/atom_numbs #eV/atom
forces_label = k['forces'][frame_idx]
stress_label = k['virials'][frame_idx]

energy_label_list_sys.append(energy_label)
energy_chgnet_list_sys.append(energy_chgnet)
forces_label_list_sys.append(forces_label)
forces_chgnet_list_sys.append(forces_chgnet)
natoms.append(atom_numbs)


energy_label_list.append(energy_label_list_sys)
energy_chgnet_list.append(energy_chgnet_list_sys)

# Calculating energy bias
natoms_array = np.array(natoms).reshape(-1, 1)
energy_label_array = np.array(energy_label_list_sys).reshape(-1, 1)
energy_chgnet_array = np.array(energy_chgnet_list_sys).reshape(-1, 1)
energy_label_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_label_array, rcond=None)
energy_chgnet_bias, _, _, _ = np.linalg.lstsq(natoms_array, energy_chgnet_array, rcond=None)
energy_label_bias_list.append(energy_label_bias)
energy_chgnet_bias_list.append(energy_chgnet_bias)

forces_label_list_sys_flattened = [element for sublist in forces_label_list_sys for element in sublist]
forces_chgnet_list_sys_flattened = [element for sublist in forces_chgnet_list_sys for element in sublist]
forces_label_list.append(forces_label_list_sys_flattened)
forces_chgnet_list.append(forces_chgnet_list_sys_flattened)

energy_array_label = np.array(energy_label_list[idx]) - energy_label_bias_list[idx] # considering energy bias
energy_array_chgnet = np.array(energy_chgnet_list[idx]) - energy_chgnet_bias_list[idx] # considering energy bias
forces_array_label = np.array(forces_label_list_sys_flattened[idx])
forces_array_chgnet = np.array(forces_chgnet_list_sys_flattened[idx])

# Calculate RMSEs
rmse_e = np.sqrt(np.mean((energy_array_label - energy_array_chgnet) ** 2))
rmse_f = np.sqrt(np.mean((forces_array_label - forces_array_chgnet) ** 2))
#print(f"{dataset} (rmse_e/rmse_f):\t{1000*rmse_e:.4f}meV/atom\t{1000*rmse_f:.4f}meV/ang.\n")

# Calculate MAEs
mae_e = np.mean(np.abs(energy_array_label - energy_array_chgnet))
mae_f = np.mean(np.abs(forces_array_label - forces_array_chgnet))
#print(f"{dataset} (mae_e/mae_f):\t{1000*mae_e:.4f}meV/atom\t{1000*mae_f:.4f}meV/ang.\n")

results.append({
"Dataset": dataset,
"MAE (Energy)[meV/atom]": f"{1000*mae_e:.1f}",
"MAE (Force)[meV/ang.]": f"{1000*mae_f:.1f}",
"RMSE (Energy)[meV/atom]": f"{1000*rmse_e:.1f}",
"RMSE (Force)[meV/ang.]": f"{1000*rmse_f:.1f}"
})

#---------------------


# Create a pandas DataFrame
df_results = pd.DataFrame(results)

# Convert DataFrame to markdown-formatted string
markdown_table = df_results.to_markdown(index=False)

# To print or write the markdown table to a file, depending on your needs
print(markdown_table)

| Dataset   |   MAE (Energy)[meV/atom] |   MAE (Force)[meV/ang.] |   RMSE (Energy)[meV/atom] |   RMSE (Force)[meV/ang.] |
|:----------|-------------------------:|------------------------:|--------------------------:|-------------------------:|
| LGPS      |                    288.2 |                   118.6 |                     288.9 |                    133.1 |
代码
文本

3. 扩散行为计算 Calculation of Diffusion Performance

3.1 基本原理和计算过程 Basic Principles and Calculation Procedure

(1) 均方根位移MSD:计算t时间内Li离子移动距离平方的平均 (2) 扩散系数D:求解t时间内的MSD在每个维度上的平均 (3) 离子电导率:根据Nernst-Einstein Equation计算 (4) 扩散系数和温度之间满足阿伦尼乌斯关系: (5) 因此,离子电导率和温度之间满足:

在研究LXPS体系的扩散系数时,我们将通用力场与DPA-2和AIMD的结果进行了比较。结果表明,使用DPA势函数计算得到的扩散系数与AIMD的结果具有很好的一致性。在室温条件下,实验测得的锂离子扩散系数通常在的数量级。相比之下,通用力场往往会高估锂离子的扩散系数,误差达到2到3个数量级,因此无法实现准确的预测。


(1) Root mean square displacement MSD: calculate the average of the square of the distance moved by Li ions in time t (2) Diffusion coefficient D: solve the average of MSD in each dimension in time t (3) Ionic conductivity : calculated according to Nernst-Einstein Equation (4) The diffusion coefficient and temperature satisfy the Arrhenius relationship: (5) Therefore, the relationship between ionic conductivity and temperature satisfies:

When studying the diffusion coefficient of the LXPS system, we compared the results of the universal force field with those of DPA-2 and AIMD. The results show that the diffusion coefficient calculated using the DPA potential function is in good agreement with the results of AIMD. At room temperature, the experimentally measured diffusion coefficient of lithium ions is usually on the order of . In contrast, the universal force field tends to overestimate the diffusion coefficient of lithium ions by 2 to 3 orders of magnitude, making it impossible to achieve accurate predictions.

代码
文本

3.2 分子动力学模拟 MD Simulation

如前所述,正因为M3GNET和CHGNET的特点都是内部嵌入了ase的部分模块,所以主体的代码结构非常相似,只不过调用模型的具体方式有所不同。我们已经在能量、受力精度评测的部分展示完整的计算代码,为了节省篇幅,下面分子动力学部分将只展示CHGNET在NVT系综下,控温于500K的计算.


As mentioned above, because both M3GNET and CHGNET have some modules of ase embedded inside, the main code structure is very similar, but the specific way of calling the model is different. We have shown the complete calculation code in the energy and force accuracy evaluation part. In order to save space, the following molecular dynamics part will only show the calculation of CHGNET under NVT ensemble and temperature control at 500K.

代码
文本
[23]
import os
os.chdir("/bohr/LAM-SSE-7xfp/v4/20240412-data-SSE/20240412-SSE-CHGNET-M3GNET/01.diffusion/")
os.system("tree -L 3")
.
├── LXPS-structure
│   └── LGPS
│       └── CONTCAR
├── chgnet
│   └── LGPS_order
│       └── 500k
├── chgnet_merged.json
├── m3gnet
│   └── LGPS_order
│       └── 500k
├── m3gnet_merged.json
└── summary.txt

8 directories, 4 files
0
代码
文本

CHGNET

代码
文本
[30]
import os
os.chdir("/bohr/LAM-SSE-7xfp/v4/20240412-data-SSE/20240412-SSE-CHGNET-M3GNET/01.diffusion/chgnet")
代码
文本

../LXPS-structure中存储了Li-Ge-P-S的POSCAR格式文件;

下面的代码是将../LXPS-structure中的结构文件复制进文件夹中.


In ../LXPS-structure stores the POSCAR format file of Li-Ge-P-S;

The following code is to copy the structure file in ../LXPS-structure into the folder.

代码
文本
[ ]
import os
import shutil

folders = [folder for folder in os.listdir('.') if os.path.isdir(folder)]

for folder in folders:
parts = folder.split('_')
if len(parts) == 3:
src_file = f'../LXPS-structure/{parts[0]}/disorder_CONTCAR{parts[2]}'
if os.path.exists(src_file):
shutil.copy(src_file, folder)
elif len(parts) == 2 and parts[1].lower() == 'order':
src_file = f'../LXPS-structure/{parts[0]}/CONTCAR'
if os.path.exists(src_file):
shutil.copy(src_file, folder)
代码
文本

批量复制、修改输入文件


The following code is to batch copy and modify the input file.

代码
文本
[ ]
import os
import shutil
import json
import subprocess

def execute_script_in_all_folders(script_name, target_folders):
for folder in target_folders:
subprocess.run(['python', script_name], cwd=folder)

# Get all folders in the current directory
folders = [f.path for f in os.scandir() if f.is_dir()]

# List of files to copy
files_to_copy = [f for f in os.listdir() if f.endswith('.json') or f.endswith('.py')]

# Temperature list
temp_list = [500] ## FOR DEMONSTRATION, ONLY 500K IS LISTED

# Copy *.json and *.py files to all folders and perform modifications
for folder in folders:
for file_name in files_to_copy:
shutil.copy(file_name, folder)

# Special handling for 'job.json'
if file_name == 'job.json':
with open(os.path.join(folder, file_name), 'r') as f:
job_data = json.load(f)

for temp in temp_list:
new_file_name = f'job_{temp}.json'
job_data['job_name'] = f'chgnet_{temp}_{os.path.basename(folder)}'
job_data['command'] = f'python chgnet_md_{temp}.py'
job_data['result'] = f'./{temp}k'

with open(os.path.join(folder, new_file_name), 'w') as new_file:
json.dump(job_data, new_file, indent=4)

# Special handling for 'chgnet_md.py'
elif file_name == 'chgnet_md.py':
for temp in temp_list:
new_file_name = f'chgnet_md_{temp}.py'
new_file_path = os.path.join(folder, new_file_name)
with open(file_name, 'r') as f:
lines = f.readlines()
with open(new_file_path, 'w') as new_file:
for line in lines:
if 'temp = 500' in line:
line = f'temp = {temp}\n'
new_file.write(line)

# Execute 'poscar2cif.py' in each folder
execute_script_in_all_folders('poscar2cif.py', folders)

print("File copying, modification, and script execution completed.")
代码
文本

下面将POSCAR转成带有氧化数信息的cif文件。


Next, convert POSCAR into a cif file with oxidation number information.

代码
文本
[ ]
%%file poscar2cif.py
from pymatgen.io.vasp import Poscar
from pymatgen.io.cif import CifWriter
import glob


oxidation_states = {'Li': 1, 'Ge': 4, 'Sn': 4, 'Si': 4, 'P': 5, 'S': -2}

poscar = Poscar.from_file(glob.glob("*CAR*")[0])
structure = poscar.structure

for element, oxidation_state in oxidation_states.items():
for site in structure:
if site.specie.symbol == element:
site.specie.oxi_state = oxidation_state


cif_writer = CifWriter(structure, symprec=None)
cif_writer.write_file('structure.cif')
代码
文本

下面是lbg的模板json文件。


The following is the template json of lbg submission.

代码
文本
[ ]
%%file job.json
{
"job_name": "chgnet_500k",
"command": "python chgnet_md.py",
"log_file": "log*",
"backward_files": [ ],
"project_id": <ID>,
"platform": "ali",
"machine_type": "c8_m32_1 * NVIDIA V100",
"job_type": "container",
"image_address": "registry.dp.tech/dptech/prod-19853/chgnet-m3gnet:0.0.3",
"result": "./500k"
}

代码
文本

下面是CHGNET的分子动力学参数模板脚本.


Below is the molecular dynamics parameter template script for CHGNET.

代码
文本
[ ]
%%file chgnet_md.py

from chgnet.model.model import CHGNet
from chgnet.model.dynamics import MolecularDynamics
from pymatgen.core import Structure
import warnings
warnings.filterwarnings("ignore", module="pymatgen")
warnings.filterwarnings("ignore", module="ase")

structure = Structure.from_file("structure.cif")
chgnet = CHGNet.load()

# Expand the structure by a 2x2x2 supercell
supercell_matrix = [2, 2, 2] # This defines the size of the supercell
structure.make_supercell(supercell_matrix)

timestep = 2 # fs
dumpfreq = 100 # step
temp = 500 # Kelvin
press = 0.0001 # GPa

md = MolecularDynamics(
atoms=structure,
model=chgnet,
ensemble="nvt",
temperature=temp,
timestep=timestep, # in femto-seconds
trajectory=f"chgnet_{temp}_nvt.traj",
logfile=f"chgnet_{temp}_nvt.log",
loginterval=dumpfreq,
)
md.run(1e5)
代码
文本

批量提交任务

代码
文本
[ ]
import os
import subprocess
import glob


current_directory = os.getcwd()

for item in os.listdir(current_directory):
path = os.path.join(current_directory, item)
if os.path.isdir(path):
os.chdir(path)
json_files = glob.glob('job_*.json')
for json_file in json_files:
cmd = f'lbg job submit -i {json_file} -p ./ '
process = subprocess.Popen(cmd, shell=True)
process.wait()

os.chdir(current_directory)
代码
文本

M3GNET

用M3GNET的计算流程与CHGNET十分相似,只不过调用模型的代码有所区别,因此下面只展示m3gnet_md.py的代码。


The calculation procedure of M3GNET is very similar to that of CHGNET, except that the code for calling the model is different, so only the code of m3gnet_md.py is shown below.

代码
文本
[42]
os.chdir("/bohr/LAM-SSE-7xfp/v4/20240412-data-SSE/20240412-SSE-CHGNET-M3GNET/01.diffusion/m3gnet")
代码
文本
[ ]
%%file m3gnet_md.py

from pymatgen.core import Structure, Lattice
from m3gnet.models import MolecularDynamics

structure = Structure.from_file("structure.cif")
# Expand the structure by a 2x2x2 supercell
supercell_matrix = [2, 2, 2] # This defines the size of the supercell
structure.make_supercell(supercell_matrix)



timestep = 2 # fs
dumpfreq = 100 # step
temp = 500 # Kelvin
press = 0.0001 # GPa


md = MolecularDynamics(
atoms=structure,
ensemble="nvt",
temperature=temp,
timestep=timestep, # in femto-seconds
trajectory=f"m3gnet_{temp}_nvt.traj",
logfile=f"m3gnet_{temp}_nvt.log",
loginterval=dumpfreq,
)
md.run(1e5)
代码
文本

3.2 离子扩散系数 Ionic Diffusion Coefficient

代码
文本
[33]

import numpy as np
import glob
from ase.io import read
import matplotlib.pyplot as plt
import pandas as pd
import json


# Constants
time_step = 2e-15 # Time step in seconds (1 fs)
dump_freq = 100 # Frequency at which data is dumped
angstrom_to_meter = 1e-10 # Conversion from Å to meters
window_size = 1 # Every 100 steps

# Read trajectory file
trajs = glob.glob("*/*/*nvt.traj")
print(trajs)

# Create an empty DataFrame to hold the results
data_for_df = []

# Initialize the result nested dictionary
results = {}
for traj in trajs:
disorder = traj.split('/',3)[0]
temperature = int(traj.split('/',3)[1][:-1])
sys_name = disorder.split('_',3)[0]
traj = read(traj, index=":")
n = len(traj)
n1 = int(0.3*n)
n2 = int(0.9*n)
# Get chemical symbols for all atoms
symbols = traj[0].get_chemical_symbols()

## Identify unique elements
unique_elements = set(symbols)

# Initialize a dictionary to store MSD values
msd_values = {element: [] for element in unique_elements}

# Initialize a dictionary to store time points
time_points = {element: [] for element in unique_elements}

# Compute the MSD for each element
for element in unique_elements:
indices = [i for i, symbol in enumerate(symbols) if symbol == element]

# Calculate displacements with respect to the first frame of the entire simulation
initial_positions = traj[0].get_positions(wrap=False)[indices]

for i in range(1, n):
current_positions = traj[i].get_positions(wrap=False)[indices]
displacement = current_positions - initial_positions
# Check for periodic boundary conditions
cell = traj[i].cell.array
# displacement -= np.dot(np.round(displacement @ np.linalg.inv(cell)), cell)
msd = np.mean(np.sum(displacement ** 2, axis=1))
msd_values[element].append(msd)
time_points[element].append(i * time_step * dump_freq)

# Fit the MSD versus time for each element and calculate the diffusion coefficient
diffusion_coefficients = {}
for element, msds in msd_values.items():
times = np.array(time_points[element])
slope, intercept = np.polyfit(times[n1:n2], msds[n1:n2], 1)
diffusion_coefficient = slope * angstrom_to_meter**2 / 6 # Divide slope by 6 for 3D
diffusion_coefficients[element] = diffusion_coefficient
plt.plot(times, msds, label=element)

plt.xlabel('Time (s)')
plt.ylabel('MSD (Ų)')
plt.legend()
plt.show()

# Print the diffusion coefficients in m²/s for each element
for element, diffusion_value in diffusion_coefficients.items():
print(f"{element}: {diffusion_value:.4e} m²/s")
print(f"{diffusion_coefficients['Li']:.4e}")


if 'Li' in diffusion_coefficients:
# If sys_name does not exist in results, initialize it
if sys_name not in results:
results[sys_name] = {}
# If temperature does not exist under the sys_name, initialize it
if temperature not in results[sys_name]:
results[sys_name][temperature] = {}
# Assign the disorder with its diffusion coefficient under the correct system name and temperature
results[sys_name][temperature][disorder] = f"{diffusion_coefficients['Li']:.4e}"

# Only append to our data if Lithium is in the diffusion_coefficients
data_for_df.append({
'disorder': disorder,
'temperature': temperature,
'Li_diffusion_coefficient': f"{diffusion_coefficients['Li']:.4e}"
})

# Create a pandas DataFrame from the collected data
df = pd.DataFrame(data_for_df)

# Sort the DataFrame by 'disorder' and then by 'temperature'
df_sorted = df.sort_values(by=['disorder', 'temperature'], ascending=[True, True])

# Convert the sorted DataFrame to markdown table string
markdown_table = df_sorted.to_markdown(index=False)

# Print the markdown table
print(markdown_table)

'''
# Save the nested dictionary to a JSON file with the desired structure
with open("../chgnet_merged.json", "w") as outfile:
json.dump(results, outfile, indent=2)

# Inform the user that the data has been saved
print("Data has been successfully saved to ../chgnet_merged.json.")'''
['LGPS_order/500k/chgnet_500_nvt.traj']
S: 1.0660e-12 m²/s
P: 7.4648e-13 m²/s
Li: 2.0311e-09 m²/s
Ge: 1.6669e-12 m²/s
2.0311e-09
| disorder   |   temperature |   Li_diffusion_coefficient |
|:-----------|--------------:|---------------------------:|
| LGPS_order |           500 |                 2.0311e-09 |
'\n# Save the nested dictionary to a JSON file with the desired structure\nwith open("../chgnet_merged.json", "w") as outfile:\n    json.dump(results, outfile, indent=2)\n\n# Inform the user that the data has been saved\nprint("Data has been successfully saved to ../chgnet_merged.json.")'
代码
文本

等回传完成后,用以下代码计算扩散系数.


After the return is completed, use the following code to calculate the diffusion coefficient.

代码
文本

3.3 离子电导率 Ionic Conductivity

代码
文本
[34]
cd ../
代码
文本
[41]

import numpy as np
from scipy.stats import linregress
import json
from ase.io import read
import glob

# Constants definition
z = 1 # Valence state
e = 1.60e-19 # Electron charge (C)
kB = 1.380649e-23 # Boltzmann constant (J/K)
unit_conversion = 10 # Unit conversion from SI to mS/cm (S/m)
R = 8.314 # Ideal gas constant (J/(mol*K))

# Function to calculate conductivity based on N,V,T,d
def calculate_conductivity(N, V, T, d):
A3_to_m3 = 1e-30
sigma = N * z**2 * e**2 / (V * A3_to_m3 * kB * T) * d * unit_conversion
return sigma

# Function to load JSON data from a filepath
def load_json_data(filepath):
with open(filepath, 'r', encoding='utf-8') as file:
return json.load(file)

# Function to read volume and particle number using pymatgen
def analyze_traj(traj_file):
atoms = read(traj_file, index=0)
volume = atoms.get_volume()
num_Li = sum(atom.symbol == 'Li' for atom in atoms)
return volume, num_Li

def calculate_average_std(conductivity_values):
# Calculate the average and standard deviation of the conductivity values
average = np.mean(conductivity_values)
std = np.std(conductivity_values)
return average, std

# Function to process JSON data and print conductivity values
def process_json_data(json_data, model_name):
for material, data in json_data.items():
temperatures = []
average_conductivities = []
for temp_str, disorders in data.items():
temperature = int(temp_str)
temperatures.append(temperature)
disorder_names = list(disorders.keys())
conductivity_values = []
for disorder in disorder_names:
try:
traj = glob.glob(f'./{model_name}/{disorder}/{temperature}k/*nvt.traj')[0]
V, N = analyze_traj(traj)
d = float(disorders[disorder])
conductivity = calculate_conductivity(N, V, temperature, d)
conductivity_values.append(conductivity)
except Exception as e:
pass
if conductivity_values:
average_conductivity, std_conductivity = calculate_average_std(conductivity_values)
average_conductivities.append(average_conductivity)
print(f'{material}, {temperature}, {average_conductivity:.1f} ± {std_conductivity:.1f} mS/cm')



# Load data
chgnet_data = load_json_data('chgnet_merged.json')
m3gnet_data = load_json_data('m3gnet_merged.json')

# Process and print the results
print('CHGNET data results:')
process_json_data(chgnet_data, 'chgnet')

print('M3GNET data results:')
process_json_data(m3gnet_data, 'm3gnet')

CHGNET data results:
LGPS, 500, 1608.8 ± 0.0 mS/cm
M3GNET data results:
LGPS, 500, 3479.9 ± 0.0 mS/cm
代码
文本

4. 总结 Summarization

基于硫族化合物固态电解质的基本物质组成,我们对比了DPA-2预训练模型和通用力场(CHGNET、M3GNET)在能量受力的推理和扩散性质预测的表现,我们发现DPA模型精度更高, 能更准确预测动力学性质(扩散系数和离子电导率),有助于辅助实验设计/改性。


Based on the basic material composition of chalcogenide solid electrolytes, we compared the performance of the DPA-2 pre-trained model and the general force field (CHGNET, M3GNET) in energy force reasoning and diffusion property prediction. We found that the DPA model has higher accuracy and can more accurately predict kinetic properties (diffusion coefficient and ionic conductivity), which helps to assist experimental design/modification.

代码
文本
notebook
notebook
已赞2
本文被以下合集收录
ML-PES
bohr4a0048
更新于 2024-09-10
6 篇0 人关注
DEEPMD
bohrb1cdbc
更新于 2024-08-25
12 篇0 人关注
推荐阅读
公开
现代化学设计实验课程
notebook2403-计算材料学原理分子动力学量子化学
notebook2403-计算材料学原理分子动力学量子化学
zzh
更新于 2024-06-17
107 赞43 转存文件
公开
杨飞宇-第1天-2403-计算材料学原理
2403-计算材料学原理
2403-计算材料学原理
f
发布于 2024-03-09