Quick Start DeePMD-kit | Train Methane Deep Potential Molecular Dynamics Model
©️ Copyright 2024 @ Authors
Date: 2024-07-05
Sharing Agreement: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick Start: Click the Connect button above, select the deepmd-kit:2.2.1-cuda11.6-notebook image and c8_m16_cpu node configuration, and wait for a moment to run.
Deep Potential sits at the intersection of machine learning and physical principles and presents a new computational paradigm, as shown in the figure below.
Figure | A new computational paradigm consisting of molecular modeling, machine learning, and high-performance computing (HPC).
If you need a deeper understanding of the Deep Potential, you can click 👉 From DFT to MD: A Comprehensive 「Deep Potential」 Guide to Getting Started with Materials Computation
Objective
Master the paradigm cycle of using DeePMD-kit to build deep potential molecular dynamics models and learn how to apply them to molecular dynamics tasks with complete examples.
After learning this tutorial, you will be able to:
- Understand the data formats and run scripts required for training DeePMD-kit
- Train, freeze, compress, and test DeePMD-kit models
- Call DeePMD-kit for calculations in the molecular dynamics software LAMMPS.
For more information on DeePMD learning materials, please refer to the following two courses:
Reading this tutorial will take at most about 20 minutes, so let's get started!
Background
In this tutorial, we will use gaseous methane molecules as an example to provide a detailed introduction to the training and application of Deep Potential (DP) models.
DeePMD-kit is a software package that fits first-principles data with neural networks to obtain potential energy models for molecular dynamics simulations. Without manual intervention, it can convert user-provided data into a deep potential model end-to-end within several hours. The resulting model integrates seamlessly with common molecular dynamics software such as LAMMPS, OpenMM, and GROMACS.
By combining high-performance computing and machine learning, DeePMD-kit has extended the limits of molecular dynamics simulations by several orders of magnitude, reaching system sizes of billions of atoms while retaining "ab initio" accuracy. The accessible simulation time scale is at least 1000 times longer than with traditional methods. This work won the 2020 ACM Gordon Bell Prize, the highest award in the field of high-performance computing, and DeePMD-kit has been used by thousands of research groups in physics, chemistry, materials, biology, and other fields.
For more detailed usage, you can refer to the documentation of DeePMD-kit as a complete reference.
In this tutorial, the Deep Potential (DP) model is generated using the DeePMD-kit package (v2.2.1).
The current path is: /personal/bohr/deepmd-kit-8n4p/v1
Let's take a look at the downloaded DeePMD-kit_Tutorial folder.
DeePMD-kit_Tutorial
├── 00.data
├── 01.train
└── 02.lmp

3 directories, 0 files
There are three sub-folders in the DeePMD-kit_Tutorial folder: 00.data, 01.train, and 02.lmp.
- The 00.data folder is used to store training and testing data.
- The 01.train folder contains example scripts for training models using DeePMD-kit.
- The 02.lmp folder contains example scripts for molecular dynamics simulations using LAMMPS.
Let's first take a look at the DeePMD-kit_Tutorial/00.data folder.
DeePMD-kit_Tutorial/00.data
├── abacus_md
├── training_data
└── validation_data

3 directories, 0 files
The training data of DeePMD-kit comes from first-principles calculations and includes atomic types, the simulation cell, atomic coordinates, atomic forces, the system energy, and the virial.
The abacus_md folder under 00.data holds the raw data, obtained by running ab initio molecular dynamics (AIMD) with ABACUS. In this tutorial, the AIMD calculation of the methane molecule has already been completed for you.
Detailed explanations of ABACUS can be found in its documentation. You can also find help in Section 2 of In-depth "Deep Potential" Material Calculation Guide.
DeePMD-kit adopts a compressed data format. All training data must first be converted to this format before it can be used in DeePMD-kit. The data format is explained in detail in the DeePMD-kit manual, which can be found in the DeePMD-kit GitHub repository.
We provide a convenient tool called dpdata to convert data generated by VASP, CP2K, Gaussian, Quantum-Espresso, ABACUS, and LAMMPS into the compressed format of DeePMD-kit.
A snapshot of a molecular system with computational data information is called a frame. The data system consists of many frames that share the same number of atoms and atomic types.
For example, a molecular dynamics trajectory can be converted into a data system, where each time step corresponds to a frame in the system.
Next, we use the dpdata tool to randomly split the data in abacus_md into training and validation data.
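The split described above can be sketched as follows. This is a minimal sketch, assuming dpdata is installed and that the AIMD output in abacus_md can be read with dpdata's "abacus/md" format. The helper names split_indices and split_abacus_md are illustrative and not part of dpdata, and the default of 40 validation frames is taken from the output shown next.

```python
import random


def split_indices(n_frames, n_validation, seed=None):
    """Return (training, validation) frame indices, drawn at random without overlap."""
    rng = random.Random(seed)
    validation = sorted(rng.sample(range(n_frames), n_validation))
    held_out = set(validation)
    training = [i for i in range(n_frames) if i not in held_out]
    return training, validation


def split_abacus_md(src, train_dir, val_dir, n_validation=40, seed=None):
    """Load an ABACUS MD trajectory and write DeePMD-kit training/validation sets."""
    import dpdata  # imported lazily so split_indices works without dpdata

    data = dpdata.LabeledSystem(src, fmt="abacus/md")
    print("# The data contains %d frames" % len(data))

    training, validation = split_indices(len(data), n_validation, seed)
    # sub_system selects frames by index; to_deepmd_npy writes the set.000 layout
    data.sub_system(training).to_deepmd_npy(train_dir)
    data.sub_system(validation).to_deepmd_npy(val_dir)

    print("# The training data contains %d frames" % len(training))
    print("# The validation data contains %d frames" % len(validation))


# Example call (paths follow the folder layout shown above):
# split_abacus_md(
#     "DeePMD-kit_Tutorial/00.data/abacus_md",
#     "DeePMD-kit_Tutorial/00.data/training_data",
#     "DeePMD-kit_Tutorial/00.data/validation_data",
# )
```

Passing a fixed seed makes the split reproducible, which is convenient when comparing models trained with different settings.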
# The data contains 201 frames
# The training data contains 161 frames
# The validation data contains 40 frames
You can see that 161 frames were selected as training data, and the other 40 frames are validation data.
Let's take another look at the 00.data folder, where new files have been generated. These files are the training set and validation set required for DeePMD-kit deep potential training.
DeePMD-kit_Tutorial/00.data/
├── abacus_md
├── training_data
└── validation_data

3 directories, 0 files
DeePMD-kit_Tutorial/00.data/training_data
├── set.000
├── type.raw
└── type_map.raw

1 directory, 2 files
The roles of these files are as follows:
- set.000: a directory that contains data in a compressed format (NumPy compressed arrays).
- type.raw: a file that contains the types of atoms (represented by integers).
- type_map.raw: a file that contains the names of the atom types.
0 0 0 0 1
This tells us that there are 5 atoms in this example, with 4 atoms represented by the type "0" and 1 atom represented by the type "1". Sometimes it is necessary to map integer types to atom names. The mapping can be provided through the file type_map.raw.
H C
This tells us that the type "0" is named "H" and the type "1" is named "C".
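The mapping between the two files can be checked with a few lines of Python. This is a minimal sketch that only assumes both files are whitespace-separated plain text, as in the listings above; the helper name atom_names is illustrative.

```python
from pathlib import Path


def atom_names(system_dir):
    """Map the integer types in type.raw to the names in type_map.raw."""
    system = Path(system_dir)
    types = [int(t) for t in (system / "type.raw").read_text().split()]
    names = (system / "type_map.raw").read_text().split()
    # The integer type is simply an index into the list of names
    return [names[t] for t in types]


# Example call:
# print(atom_names("DeePMD-kit_Tutorial/00.data/training_data"))
```

For the file contents shown above, this returns ['H', 'H', 'H', 'H', 'C'], that is, one methane molecule.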
Detailed instructions for converting data with dpdata are available in the dpdata documentation.
2 Preparing the Input Script
Once the training data is prepared, the next step is to proceed with the training. DeePMD-kit requires a JSON-formatted file to specify the training parameters. This file is known as the input script for DeePMD-kit. Let's navigate to the training directory to examine this input script:
{
    "_comment": " model parameters",
    "model": {
        "type_map": ["H", "C"],
        "descriptor" :{
            "type": "se_e2_a",
            "sel": "auto",
            "rcut_smth": 0.50,
            "rcut": 6.00,
            "neuron": [25, 50, 100],
            "resnet_dt": false,
            "axis_neuron": 16,
            "seed": 1,
            "_comment": " that's all"
        },
        "fitting_net" : {
            "neuron": [240, 240, 240],
            "resnet_dt": true,
            "seed": 1,
            "_comment": " that's all"
        },
        "_comment": " that's all"
    },
    "learning_rate" :{
        "type": "exp",
        "decay_steps": 50,
        "start_lr": 0.001,
        "stop_lr": 3.51e-8,
        "_comment": "that's all"
    },
    "loss" :{
        "type": "ener",
        "start_pref_e": 0.02,
        "limit_pref_e": 1,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0,
        "limit_pref_v": 0,
        "_comment": " that's all"
    },
    "training" : {
        "training_data": {
            "systems": ["../00.data/training_data"],
            "batch_size": "auto",
            "_comment": "that's all"
        },
        "validation_data":{
            "systems": ["../00.data/validation_data"],
            "batch_size": "auto",
            "numb_btch": 1,
            "_comment": "that's all"
        },
        "numb_steps": 10000,
        "seed": 10,
        "disp_file": "lcurve.out",
        "disp_freq": 200,
        "save_freq": 1000,
        "_comment": "that's all"
    },
    "_comment": "that's all"
}
In the model section, parameters for the embedding and fitting networks are specified.
"model":{
    "type_map": ["H", "C"],
    "descriptor":{
        "type": "se_e2_a",
        "rcut": 6.00,
        "rcut_smth": 0.50,
        "sel": "auto",
        "neuron": [25, 50, 100],
        "resnet_dt": false,
        "axis_neuron": 16,
        "seed": 1,
        "_comment": "that's all"
    },
    "fitting_net":{
        "neuron": [240, 240, 240],
        "resnet_dt": true,
        "seed": 1,
        "_comment": "that's all"
    },
    "_comment": "that's all"
},
Some of the parameters are explained as follows:
Parameter | Explanation |
---|---|
type_map | The name of each type of atom |
descriptor > type | The type of descriptor |
descriptor > rcut | The cutoff radius |
descriptor > rcut_smth | The position where smoothing begins |
descriptor > sel | The maximum number of the i-th type of atom within the cutoff radius |
descriptor > neuron | The size of the embedding neural network |
descriptor > axis_neuron | The size of the G-matrix's submatrix (embedding matrix) |
fitting_net > neuron | The size of the fitting neural network |
The se_e2_a descriptor is used to train the DP model. The neuron parameters set the sizes of the embedding and fitting networks to [25, 50, 100] and [240, 240, 240], respectively. The components in the local environment smoothly approach zero between 0.5 and 6 Å.
The following are the parameters specifying the learning rate and loss function.
"learning_rate" :{
    "type": "exp",
    "decay_steps": 50,
    "start_lr": 0.001,
    "stop_lr": 3.51e-8,
    "_comment": "that's all"
},
"loss" :{
    "type": "ener",
    "start_pref_e": 0.02,
    "limit_pref_e": 1,
    "start_pref_f": 1000,
    "limit_pref_f": 1,
    "start_pref_v": 0,
    "limit_pref_v": 0,
    "_comment": "that's all"
},
In the loss function, pref_e gradually increases from 0.02 to 1, and pref_f gradually decreases from 1000 to 1, which means that the force term dominates at the beginning, while the energy term (and the virial term, when it is enabled) becomes important at the end. This strategy is very effective and reduces the total training time. pref_v is set to 0, indicating that virial data is not included in the training process. The initial learning rate, final learning rate, and decay steps are set to 0.001, 3.51e-8, and 50, respectively. The model is trained for 10000 steps.
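The way the prefactors move between their start and limit values can be sketched as below. This is a minimal sketch, assuming the schedule described in the DeePMD-kit documentation, in which each prefactor is interpolated linearly in the ratio of the current learning rate to start_lr. The function name prefactor is illustrative, and the intermediate learning rate 1e-5 is an arbitrary sample point.

```python
def prefactor(lr, start_pref, limit_pref, start_lr=1e-3):
    """Interpolate a loss prefactor between its start and limit values.

    At lr == start_lr the result is start_pref; as lr decays towards zero
    the result approaches limit_pref.
    """
    ratio = lr / start_lr
    return limit_pref * (1.0 - ratio) + start_pref * ratio


# Sample the schedule at the initial, an intermediate, and the final learning rate
for lr in (1e-3, 1e-5, 3.51e-8):
    pref_e = prefactor(lr, start_pref=0.02, limit_pref=1)
    pref_f = prefactor(lr, start_pref=1000, limit_pref=1)
    print(f"lr = {lr:.2e}  pref_e = {pref_e:.4f}  pref_f = {pref_f:.4f}")
```

With pref_f three orders of magnitude above pref_e at the start, the early updates are driven almost entirely by the force error.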
The training parameters are shown as follows:
"training" : {
    "training_data": {
        "systems": ["../00.data/training_data"],
        "batch_size": "auto",
        "_comment": "that's all"
    },
    "validation_data":{
        "systems": ["../00.data/validation_data/"],
        "batch_size": "auto",
        "numb_btch": 1,
        "_comment": "that's all"
    },
    "numb_steps": 10000,
    "seed": 10,
    "disp_file": "lcurve.out",
    "disp_freq": 200,
    "save_freq": 1000,
    "_comment": "that's all"
},
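With the input script in place, training is started with the dp train command from the 01.train directory; the log that follows is the output of such a run. The snippet below is a small sketch of how to launch it from Python. The helper names dp_train_command and run are illustrative, and the working directory assumes the folder layout shown earlier.

```python
import shlex
import subprocess


def dp_train_command(input_script="input.json"):
    """Build the DeePMD-kit training command for a JSON input script."""
    return ["dp", "train", input_script]


def run(command, cwd=".", execute=False):
    """Print the command; run it in `cwd` only when execute=True."""
    print("$ " + shlex.join(command))
    if execute:
        subprocess.run(command, cwd=cwd, check=True)


# Dry run: only prints the command. Pass execute=True to start training.
run(dp_train_command(), cwd="DeePMD-kit_Tutorial/01.train")
```

In a notebook cell, the same step is usually written as a shell line that changes into the training directory and calls dp train input.json.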
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
2024-07-10 02:20:00.763483: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2024-07-10 02:20:00.763528: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0,1
OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #157: KMP_AFFINITY: 2 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L3 cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L2 cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L1 cache" is equivalent to "socket".
OMP: Info #192: KMP_AFFINITY: 1 socket x 1 core/socket x 2 threads/core (1 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 0 thread 1
OMP: Info #254: KMP_AFFINITY: pid 99 tid 109 thread 1 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 99 tid 111 thread 2 bound to OS proc set 0
OMP: Info #254: KMP_AFFINITY: pid 99 tid 108 thread 3 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 99 tid 112 thread 4 bound to OS proc set 0
DEEPMD INFO training data with min nbor dist: 1.0460506586976834
DEEPMD INFO training data with max nbor size: [4 1]
DEEPMD INFO _____ _____ __ __ _____ _ _ _
DEEPMD INFO | __ \ | __ \ | \/ || __ \ | | (_)| |
DEEPMD INFO | | | | ___ ___ | |__) || \ / || | | | ______ | | __ _ | |_
DEEPMD INFO | | | | / _ \ / _ \| ___/ | |\/| || | | ||______|| |/ /| || __|
DEEPMD INFO | |__| || __/| __/| | | | | || |__| | | < | || |_
DEEPMD INFO |_____/ \___| \___||_| |_| |_||_____/ |_|\_\|_| \__|
DEEPMD INFO Please read and cite:
DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO installed to: /home/conda/feedstock_root/build_artifacts/deepmd-kit_1678943793317/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO source : v2.2.1
DEEPMD INFO source brach: HEAD
DEEPMD INFO source commit: 3ac8c4c7
DEEPMD INFO source commit at: 2023-03-16 12:33:24 +0800
DEEPMD INFO build float prec: double
DEEPMD INFO build variant: cuda
DEEPMD INFO build with tf inc: /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/include;/opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/../../../../include
DEEPMD INFO build with tf lib:
DEEPMD INFO ---Summary of the training---------------------------------------
DEEPMD INFO running on: bohrium-16664-1159987
DEEPMD INFO computing device: cpu:0
DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO Count of visible GPU: 0
DEEPMD INFO num_intra_threads: 0
DEEPMD INFO num_inter_threads: 0
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: training -----------------------------------------------
DEEPMD INFO found 1 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO ../00.data/training_data 5 7 23 1.000 T
DEEPMD INFO --------------------------------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: validation -----------------------------------------------
DEEPMD INFO found 1 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO ../00.data/validation_data 5 7 5 1.000 T
DEEPMD INFO --------------------------------------------------------------------------------------
DEEPMD INFO training without frame parameter
DEEPMD INFO data stating... (this step may take long time)
OMP: Info #254: KMP_AFFINITY: pid 99 tid 99 thread 0 bound to OS proc set 0
DEEPMD INFO built lr
DEEPMD INFO built network
DEEPMD INFO built training
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
DEEPMD INFO initialize model from scratch
DEEPMD INFO start training at lr 1.00e-03 (== 1.00e-03), decay_step 50, decay_rate 0.950006, final lr will be 3.51e-08
DEEPMD INFO batch 200 training time 11.36 s, testing time 0.04 s
DEEPMD INFO batch 400 training time 10.01 s, testing time 0.04 s
DEEPMD INFO batch 600 training time 9.97 s, testing time 0.03 s
DEEPMD INFO batch 800 training time 10.01 s, testing time 0.04 s
DEEPMD INFO batch 1000 training time 10.07 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 1200 training time 10.15 s, testing time 0.04 s
DEEPMD INFO batch 1400 training time 10.05 s, testing time 0.04 s
DEEPMD INFO batch 1600 training time 10.13 s, testing time 0.04 s
DEEPMD INFO batch 1800 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 2000 training time 10.02 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 2200 training time 10.12 s, testing time 0.04 s
DEEPMD INFO batch 2400 training time 10.18 s, testing time 0.04 s
DEEPMD INFO batch 2600 training time 10.01 s, testing time 0.04 s
DEEPMD INFO batch 2800 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 3000 training time 10.05 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 3200 training time 10.02 s, testing time 0.03 s
DEEPMD INFO batch 3400 training time 10.10 s, testing time 0.05 s
DEEPMD INFO batch 3600 training time 10.08 s, testing time 0.04 s
DEEPMD INFO batch 3800 training time 10.05 s, testing time 0.03 s
DEEPMD INFO batch 4000 training time 10.11 s, testing time 0.03 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 4200 training time 9.99 s, testing time 0.04 s
DEEPMD INFO batch 4400 training time 10.01 s, testing time 0.04 s
DEEPMD INFO batch 4600 training time 10.00 s, testing time 0.03 s
DEEPMD INFO batch 4800 training time 9.99 s, testing time 0.04 s
DEEPMD INFO batch 5000 training time 9.98 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 5200 training time 10.01 s, testing time 0.04 s
DEEPMD INFO batch 5400 training time 9.99 s, testing time 0.04 s
DEEPMD INFO batch 5600 training time 9.96 s, testing time 0.03 s
DEEPMD INFO batch 5800 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 6000 training time 10.04 s, testing time 0.03 s
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/training/saver.py:1066: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/training/saver.py:1066: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 6200 training time 10.08 s, testing time 0.04 s
DEEPMD INFO batch 6400 training time 10.10 s, testing time 0.03 s
DEEPMD INFO batch 6600 training time 10.10 s, testing time 0.04 s
DEEPMD INFO batch 6800 training time 10.12 s, testing time 0.04 s
DEEPMD INFO batch 7000 training time 10.18 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 7200 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 7400 training time 10.08 s, testing time 0.04 s
DEEPMD INFO batch 7600 training time 10.10 s, testing time 0.04 s
DEEPMD INFO batch 7800 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 8000 training time 10.13 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 8200 training time 10.16 s, testing time 0.03 s
DEEPMD INFO batch 8400 training time 10.04 s, testing time 0.03 s
DEEPMD INFO batch 8600 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 8800 training time 10.13 s, testing time 0.04 s
DEEPMD INFO batch 9000 training time 10.08 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 9200 training time 10.19 s, testing time 0.04 s
DEEPMD INFO batch 9400 training time 10.08 s, testing time 0.04 s
DEEPMD INFO batch 9600 training time 10.06 s, testing time 0.04 s
DEEPMD INFO batch 9800 training time 10.09 s, testing time 0.04 s
DEEPMD INFO batch 10000 training time 10.08 s, testing time 0.04 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO average training time: 0.0503 s/batch (exclude first 200 batches)
DEEPMD INFO finished training
DEEPMD INFO wall time: 520.318 s
The information from the data system will be displayed on the screen, for example:
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: training ----------------------------------
DEEPMD INFO found 1 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO ../00.data/training_data 5 7 23 1.000 T
DEEPMD INFO -------------------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: validation ----------------------------------
DEEPMD INFO found 1 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO ../00.data/validation_data 5 7 5 1.000 T
DEEPMD INFO -------------------------------------------------------------------------
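The bch_sz and n_bch columns in this summary follow from "batch_size": "auto". The sketch below reproduces them under two assumptions: that "auto" picks the smallest batch size whose total atom count reaches 32, as described in the DeePMD-kit documentation, and that n_bch is the number of full batches in the data set, which is our inference from the numbers in the log rather than a documented formula. The function names are illustrative.

```python
import math


def auto_batch_size(natoms, min_atoms_per_batch=32):
    """Smallest batch size such that batch_size * natoms >= min_atoms_per_batch."""
    return math.ceil(min_atoms_per_batch / natoms)


def full_batches(n_frames, batch_size):
    """Number of complete batches that fit into n_frames frames."""
    return n_frames // batch_size


bch_sz = auto_batch_size(natoms=5)
print("bch_sz:", bch_sz)
print("n_bch (training, 161 frames):", full_batches(161, bch_sz))
print("n_bch (validation, 40 frames):", full_batches(40, bch_sz))
```

For the 5-atom methane system this gives a batch size of 7, with 23 training batches and 5 validation batches, in agreement with the table above.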
And the initial and final learning rates for this training:
DEEPMD INFO start training at lr 1.00e-03 (== 1.00e-03), decay_step 50, decay_rate 0.950006, final lr will be 3.51e-08
If everything goes well, you will see information printed every 200 batches, for example:
DEEPMD INFO batch 200 training time 6.04 s, testing time 0.02 s
DEEPMD INFO batch 400 training time 4.80 s, testing time 0.02 s
DEEPMD INFO batch 600 training time 4.80 s, testing time 0.02 s
DEEPMD INFO batch 800 training time 4.78 s, testing time 0.02 s
DEEPMD INFO batch 1000 training time 4.77 s, testing time 0.02 s
DEEPMD INFO saved checkpoint model.ckpt
DEEPMD INFO batch 1200 training time 4.47 s, testing time 0.02 s
DEEPMD INFO batch 1400 training time 4.49 s, testing time 0.02 s
DEEPMD INFO batch 1600 training time 4.45 s, testing time 0.02 s
DEEPMD INFO batch 1800 training time 4.44 s, testing time 0.02 s
DEEPMD INFO batch 2000 training time 4.46 s, testing time 0.02 s
DEEPMD INFO saved checkpoint model.ckpt
These lines report the time spent on training and testing. Every 1000 batches, the model is saved to the TensorFlow checkpoint file model.ckpt.
At the same time, the training and testing errors are written to the file lcurve.out. This file contains 8 columns; from left to right, they are:
- Training step
- Validation loss
- Training loss
- Root mean square (RMS) validation error of energy
- RMS training error of energy
- RMS validation error of force
- RMS training error of force
- Learning rate
The learning rate is an important concept in machine learning. In the DP model, the learning rate decays exponentially from a large value to a small one, which helps the model converge efficiently while remaining accurate. Accordingly, the learning rate section has two key values: the initial learning rate (start_lr) and the final learning rate (stop_lr). In the example above, we set the initial learning rate, final learning rate, and decay steps to 0.001, 3.51e-8, and 50, respectively. So the learning rate starts from 0.001 and decreases slightly every 50 steps until it reaches 3.51e-8 (or until the training ends).
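The decay can be reproduced with a short calculation. This is a minimal sketch, assuming a staircase exponential decay in which the rate is chosen so that the learning rate reaches the final value exactly at numb_steps; the function names are illustrative and the parameter values are those of the input script.

```python
def decay_rate(start_lr, stop_lr, decay_steps, numb_steps):
    """Per-interval decay factor that takes start_lr to stop_lr in numb_steps."""
    return (stop_lr / start_lr) ** (decay_steps / numb_steps)


def learning_rate(step, start_lr=1e-3, stop_lr=3.51e-8,
                  decay_steps=50, numb_steps=10000):
    """Learning rate at a given step; it changes once every decay_steps steps."""
    rate = decay_rate(start_lr, stop_lr, decay_steps, numb_steps)
    return start_lr * rate ** (step // decay_steps)


print(f"decay_rate = {decay_rate(1e-3, 3.51e-8, 50, 10000):.6f}")
for step in (0, 9800, 10000):
    print(f"step {step:5d}  lr = {learning_rate(step):.1e}")
```

The computed decay rate agrees with the decay_rate 0.950006 reported in the training log, and the learning rates at steps 0, 9800, and 10000 can be compared with the last column of lcurve.out.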
Let's take a look at the beginning and end lines of the lcurve.out file.
#  step   rmse_val   rmse_trn   rmse_e_val   rmse_e_trn   rmse_f_val   rmse_f_trn        lr
      0   1.98e+01   1.93e+01     1.38e-01     1.36e-01     6.26e-01     6.09e-01   1.0e-03
   9800   4.68e-02   5.20e-02     9.18e-04     7.10e-04     4.57e-02     5.09e-02   4.3e-08
  10000   5.53e-02   4.32e-02     7.53e-04     7.82e-04     5.44e-02     4.25e-02   3.5e-08
The loss function can be visualized to monitor the training process.
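One way to produce such a plot is sketched below. It assumes lcurve.out has the layout shown above, with a header line starting with # followed by whitespace-separated numeric columns, and that matplotlib is installed. The helper names load_lcurve and plot_lcurve and the output file name lcurve.png are illustrative.

```python
def load_lcurve(path):
    """Read lcurve.out into a dict mapping each column name to a list of floats."""
    with open(path) as handle:
        header = handle.readline().lstrip("#").split()
        columns = {name: [] for name in header}
        for line in handle:
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and repeated headers
            for name, value in zip(header, fields):
                columns[name].append(float(value))
    return columns


def plot_lcurve(path, output="lcurve.png"):
    """Plot every error column against the training step on a log scale."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    curve = load_lcurve(path)
    fig, ax = plt.subplots()
    for name, values in curve.items():
        if name not in ("step", "lr"):
            ax.semilogy(curve["step"], values, label=name)
    ax.set_xlabel("Training step")
    ax.set_ylabel("Loss / RMSE")
    ax.legend()
    fig.savefig(output, dpi=150)


# Example call:
# plot_lcurve("DeePMD-kit_Tutorial/01.train/lcurve.out")
```

A logarithmic y-axis is used because the errors fall by several orders of magnitude over the course of training.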
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
2024-07-10 02:29:37.570599: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2024-07-10 02:29:37.570641: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
DEEPMD INFO The following nodes will be frozen: ['model_type', 'descrpt_attr/rcut', 'descrpt_attr/ntypes', 'model_attr/tmap', 'model_attr/model_type', 'model_attr/model_version', 'train_attr/min_nbor_dist', 'train_attr/training_script', 'o_energy', 'o_force', 'o_virial', 'o_atom_energy', 'o_atom_virial', 'fitting_attr/dfparam', 'fitting_attr/daparam']
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:354: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:354: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
DEEPMD INFO 1211 ops in the final graph.
The freeze step logged above outputs a model file named graph.pb in the current directory.
So far, we have used DeePMD-kit to obtain a Deep Potential model trained on high-precision ab initio molecular dynamics data: DeePMD-kit_Tutorial/01.train/graph.pb
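The freeze step above and the compression step logged next correspond to the dp freeze and dp compress subcommands. The sketch below lists both commands and runs them only on request. The output name graph-compress.pb is an assumption made for illustration (the tutorial text does not name the compressed file), and run_steps is an illustrative helper.

```python
import shlex
import subprocess

# Commands to run in the training directory after `dp train` has finished.
# "graph-compress.pb" is an assumed output name, chosen for illustration.
POST_TRAINING_STEPS = [
    ["dp", "freeze", "-o", "graph.pb"],
    ["dp", "compress", "-i", "graph.pb", "-o", "graph-compress.pb"],
]


def run_steps(steps, cwd=".", execute=False):
    """Print each command and, when execute=True, run it in `cwd`."""
    shown = []
    for command in steps:
        shown.append(shlex.join(command))
        print("$ " + shown[-1])
        if execute:
            subprocess.run(command, cwd=cwd, check=True)
    return shown


# Dry run: only prints the commands. Pass execute=True to run them.
run_steps(POST_TRAINING_STEPS, cwd="DeePMD-kit_Tutorial/01.train")
```

Freezing must come first, because compression takes the frozen graph.pb as its input.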
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
/opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged.
  _bootstrap._exec(spec, module)
2024-07-10 02:37:20.748939: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2024-07-10 02:37:20.748999: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
DEEPMD INFO
DEEPMD INFO stage 1: compress the model
DEEPMD INFO _____ _____ __ __ _____ _ _ _
DEEPMD INFO | __ \ | __ \ | \/ || __ \ | | (_)| |
DEEPMD INFO | | | | ___ ___ | |__) || \ / || | | | ______ | | __ _ | |_
DEEPMD INFO | | | | / _ \ / _ \| ___/ | |\/| || | | ||______|| |/ /| || __|
DEEPMD INFO | |__| || __/| __/| | | | | || |__| | | < | || |_
DEEPMD INFO |_____/ \___| \___||_| |_| |_||_____/ |_|\_\|_| \__|
DEEPMD INFO Please read and cite:
DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO installed to: /home/conda/feedstock_root/build_artifacts/deepmd-kit_1678943793317/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO source : v2.2.1
DEEPMD INFO source brach: HEAD
DEEPMD INFO source commit: 3ac8c4c7
DEEPMD INFO source commit at: 2023-03-16 12:33:24 +0800
DEEPMD INFO build float prec: double
DEEPMD INFO build variant: cuda
DEEPMD INFO build with tf inc: /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/include;/opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/../../../../include
DEEPMD INFO build with tf lib:
DEEPMD INFO ---Summary of the training---------------------------------------
DEEPMD INFO running on: bohrium-16664-1159987
DEEPMD INFO computing device: cpu:0
DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO Count of visible GPU: 0
DEEPMD INFO num_intra_threads: 0
DEEPMD INFO num_inter_threads: 0
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO training without frame parameter
DEEPMD INFO training data with lower boundary: [-0.9291997 -0.99961742]
DEEPMD INFO training data with upper boundary: [1.97524068 1.1051857 ]
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0,1
OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #157: KMP_AFFINITY: 2 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L3 cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L2 cache" is equivalent to "socket".
OMP: Info #287: KMP_AFFINITY: topology layer "L1 cache" is equivalent to "socket".
OMP: Info #192: KMP_AFFINITY: 1 socket x 1 core/socket x 2 threads/core (1 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 0 thread 1
OMP: Info #254: KMP_AFFINITY: pid 179 tid 179 thread 0 bound to OS proc set 0
DEEPMD INFO built lr
DEEPMD INFO built network
DEEPMD INFO built training
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
DEEPMD INFO initialize model from scratch
DEEPMD INFO finished compressing
DEEPMD INFO
DEEPMD INFO stage 2: freeze the model
DEEPMD INFO The following nodes will be frozen: ['model_type', 'descrpt_attr/rcut', 'descrpt_attr/ntypes', 'model_attr/tmap', 'model_attr/model_type', 'model_attr/model_version', 'train_attr/min_nbor_dist', 'train_attr/training_script', 'o_energy', 'o_force', 'o_virial', 'o_atom_energy', 'o_atom_virial', 'fitting_attr/dfparam', 'fitting_attr/daparam']
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:354: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/entrypoints/freeze.py:354: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/framework/convert_to_constants.py:925: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
DEEPMD INFO 847 ops in the final graph.
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information. WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0 WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0 /opt/deepmd-kit-2.2.1/lib/python3.10/importlib/__init__.py:169: UserWarning: The NumPy module was reloaded (imported a second time). This can in some cases result in small but subtle issues and is discouraged. _bootstrap._exec(spec, module) 2024-07-10 02:37:51.740912: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-07-10 02:37:51.740961: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303) WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.config.list_physical_devices('GPU')` instead. 
WARNING:tensorflow:From /opt/deepmd-kit-2.2.1/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.config.list_physical_devices('GPU')` instead. DEEPMD WARNING You can use the environment variable DP_INFER_BATCH_SIZE tocontrol the inference batch size (nframes * natoms). The default value is 1024. DEEPMD INFO # ---------------output of dp test--------------- DEEPMD INFO # testing system : ../00.data/validation_data OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0,1 OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids. OMP: Info #157: KMP_AFFINITY: 2 available OS procs OMP: Info #158: KMP_AFFINITY: Uniform topology OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket". OMP: Info #287: KMP_AFFINITY: topology layer "L3 cache" is equivalent to "socket". OMP: Info #287: KMP_AFFINITY: topology layer "L2 cache" is equivalent to "socket". OMP: Info #287: KMP_AFFINITY: topology layer "L1 cache" is equivalent to "socket". 
OMP: Info #192: KMP_AFFINITY: 1 socket x 1 core/socket x 2 threads/core (1 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 0 thread 1
OMP: Info #254: KMP_AFFINITY: pid 192 tid 196 thread 1 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 192 tid 199 thread 2 bound to OS proc set 0
DEEPMD INFO # number of test data : 40
DEEPMD INFO Energy MAE : 3.360696e-03 eV
DEEPMD INFO Energy RMSE : 4.048368e-03 eV
DEEPMD INFO Energy MAE/Natoms : 6.721391e-04 eV
DEEPMD INFO Energy RMSE/Natoms : 8.096737e-04 eV
DEEPMD INFO Force MAE : 3.785041e-02 eV/A
DEEPMD INFO Force RMSE : 4.851145e-02 eV/A
DEEPMD INFO Virial MAE : 5.403371e-02 eV
DEEPMD INFO Virial RMSE : 6.898294e-02 eV
DEEPMD INFO Virial MAE/Natoms : 1.080674e-02 eV
DEEPMD INFO Virial RMSE/Natoms : 1.379659e-02 eV
DEEPMD INFO # -----------------------------------------------
Let's calculate the correlation between the predicted data and the original data and visualize it.
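As a rough illustration of what "correlation between the predicted data and the original data" means numerically, the sketch below computes the Pearson correlation coefficient, MAE, and RMSE for two lists of energies using only the Python standard library. The energy values are hypothetical placeholders, not results from this tutorial; in practice the two lists would hold the reference (first-principles) energies and the energies predicted by the DP model for the same frames.

```python
import math


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(reference)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred) for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_ref * var_pred)


def mae(reference, predicted):
    """Mean absolute error."""
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


def rmse(reference, predicted):
    """Root-mean-square error."""
    return math.sqrt(
        sum((p - r) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    )


# Hypothetical energies in eV (placeholders for illustration only).
reference_energy = [-219.770, -219.760, -219.780, -219.750, -219.790]
predicted_energy = [-219.772, -219.757, -219.781, -219.753, -219.786]

print("Pearson r:", pearson_r(reference_energy, predicted_energy))
print("MAE  (eV):", mae(reference_energy, predicted_energy))
print("RMSE (eV):", rmse(reference_energy, predicted_energy))
```

The same MAE and RMSE definitions underlie the dp test summary above, where the per-atom energy errors correspond to the total errors divided by the 5 atoms of a methane molecule (for example, 3.360696e-03 eV / 5 ≈ 6.72e-04 eV).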
2024-07-10 02:38:35.031947: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2024-07-10 02:38:37.254219: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-07-10 02:38:37.255107: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-07-10 02:38:37.255123: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly. WARNING:tensorflow:From /opt/mamba/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version. Instructions for updating: non-resource variables are not supported in the long term WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information. 
WARNING:tensorflow:From /opt/mamba/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.config.list_physical_devices('GPU')` instead. 2024-07-10 02:38:39.548987: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2024-07-10 02:38:39.551735: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-07-10 02:38:39.551776: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303) 2024-07-10 02:38:39.551799: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (bohrium-16664-1159987): /proc/driver/nvidia/version does not exist 2024-07-10 02:38:39.569416: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled WARNING:tensorflow:From /opt/mamba/lib/python3.10/site-packages/deepmd/utils/batch_size.py:61: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version. Instructions for updating: Use `tf.config.list_physical_devices('GPU')` instead. WARNING:deepmd.utils.batch_size:You can use the environment variable DP_INFER_BATCH_SIZE tocontrol the inference batch size (nframes * natoms). The default value is 1024.
.
├── conf.lmp
├── graph.pb
└── in.lammps

0 directories, 3 files
Here conf.lmp gives the initial configuration for the molecular dynamics simulation of gas-phase methane, and in.lammps is the LAMMPS input script. If you inspect in.lammps, you will find that it is a fairly standard LAMMPS molecular dynamics input file (for more information on LAMMPS molecular dynamics simulation input files, you can read 「A Very Detailed Guide to "Deep Potential" Material Calculations | Chapter 1」).
Only two lines are specific to DeePMD-kit:
pair_style deepmd graph.pb
pair_coeff * *
Here the deepmd pair_style is invoked with the model file graph.pb, which means that the interatomic interactions are computed by the DP model stored in graph.pb.
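To make the role of these two lines concrete, here is a minimal sketch that scans the text of a LAMMPS input script and reports which model file(s) a deepmd pair_style line refers to. The helper name find_deepmd_models and the sample script fragment are hypothetical and written for illustration; the fragment is not the tutorial's actual in.lammps, and treating every ".pb" token as a model file is a simplifying assumption.

```python
def find_deepmd_models(script_text):
    """Return the .pb model files named on a `pair_style deepmd` line.

    Hypothetical helper: returns an empty list when no such line exists.
    """
    for raw_line in script_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated tokens.
        tokens = raw_line.split("#", 1)[0].split()
        if len(tokens) >= 3 and tokens[0] == "pair_style" and tokens[1] == "deepmd":
            # Assumption: model files are the tokens ending in ".pb".
            return [tok for tok in tokens[2:] if tok.endswith(".pb")]
    return []


# Hypothetical fragment of an input script (illustration only).
sample_script = """\
units           metal
read_data       conf.lmp
pair_style      deepmd graph.pb   # DP model provides the interactions
pair_coeff      * *
"""

print(find_deepmd_models(sample_script))
```

A check like this can be handy before launching a run, for instance to confirm that the model file named in the script is the one present in the working directory.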
In an environment with a compatible version of LAMMPS, deep potential molecular dynamics simulation can be executed with the following command:
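The sketch below shows one way such a run could be launched from Python. It assumes that the LAMMPS executable is available on the PATH under the name lmp and that the input script is named in.lammps, as in the directory listing above; both the executable name and the helper functions are assumptions made for illustration, so adjust them to your installation.

```python
import shutil
import subprocess


def build_lammps_command(input_file="in.lammps", executable="lmp"):
    """Build the argument list for a LAMMPS run (`-i` selects the input script)."""
    return [executable, "-i", input_file]


def run_lammps(workdir=".", input_file="in.lammps", executable="lmp"):
    """Run LAMMPS in `workdir` if the executable can be found; otherwise skip."""
    if shutil.which(executable) is None:
        print(f"'{executable}' not found on PATH; skipping the run.")
        return None
    command = build_lammps_command(input_file, executable)
    # check=True raises CalledProcessError if LAMMPS exits with an error code.
    return subprocess.run(command, cwd=workdir, check=True)


# Show the command that would be executed, without launching anything.
print(" ".join(build_lammps_command()))
```

Calling run_lammps() with workdir set to the directory that contains conf.lmp, graph.pb, and in.lammps would then start the simulation and stream the LAMMPS log to the console.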
Warning: This LAMMPS executable is in a conda environment, but the environment has not been activated. Libraries may fail to load. To activate this environment please see https://conda.io/activation. LAMMPS (23 Jun 2022 - Update 1) OMP_NUM_THREADS environment is not set. Defaulting to 1 thread. (src/comm.cpp:98) using 1 OpenMP thread(s) per MPI task Loaded 1 plugins from /opt/deepmd-kit-2.2.1/lib/deepmd_lmp Reading data file ... triclinic box = (0 0 0) to (10.114259 10.263124 10.216793) with tilt (0.036749877 0.13833062 -0.056322169) 1 by 1 by 1 MPI processor grid reading atoms ... 5 atoms read_data CPU = 0.012 seconds DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information. Summary of lammps deepmd module ... >>> Info of deepmd-kit: installed to: /opt/deepmd-kit-2.2.1 source: v2.2.1 source branch: HEAD source commit: 3ac8c4c7 source commit at: 2023-03-16 12:33:24 +0800 surpport model ver.:1.1 build variant: cuda build with tf inc: /opt/deepmd-kit-2.2.1/include;/opt/deepmd-kit-2.2.1/include build with tf lib: /opt/deepmd-kit-2.2.1/lib/libtensorflow_cc.so set tf intra_op_parallelism_threads: 0 set tf inter_op_parallelism_threads: 0 >>> Info of lammps module: use deepmd-kit at: /opt/deepmd-kit-2.2.1DeePMD-kit WARNING: Environmental variable OMP_NUM_THREADS is not set. Tune OMP_NUM_THREADS for the best performance. See https://deepmd.rtfd.io/parallelism/ for more information. DeePMD-kit: Successfully load libcudart.so 2024-07-10 02:40:48.199790: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 
2024-07-10 02:40:48.205094: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 2024-07-10 02:40:48.205124: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303) 2024-07-10 02:40:48.205149: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (bohrium-16664-1159987): /proc/driver/nvidia/version does not exist 2024-07-10 02:40:48.206778: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance. 2024-07-10 02:40:48.259249: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled >>> Info of model(s): using 1 model(s): graph.pb rcut in model: 6 ntypes in model: 2 CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE Your simulation uses code contributions which should be cited: - USER-DEEPMD package: The log file lists these citations in BibTeX format. CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule Neighbor list info ... update every 10 steps, delay 0 steps, check no max neighbors/atom: 2000, page size: 100000 master list distance cutoff = 7 ghost atom cutoff = 7 binsize = 3.5, bins = 3 3 3 1 neighbor lists, perpetual/occasional/extra = 1 0 0 (1) pair deepmd, perpetual attributes: full, newton on pair build: full/bin/atomonly stencil: full/bin/3d bin: standard Setting up Verlet run ... 
Unit style : metal
Current step : 0
Time step : 0.001
Per MPI rank memory allocation (min/avg/max) = 3.809 | 3.809 | 3.809 Mbytes
Step PotEng KinEng TotEng Temp Press Volume
0 -219.77031 0.025852029 -219.74446 50 -822.49232 1060.5429
100 -219.76464 0.020095149 -219.74455 38.865709 -779.63202 1060.5429
200 -219.77378 0.027782066 -219.746 53.732854 -647.97367 1060.5429
300 -219.77702 0.029854226 -219.74717 57.740586 -350.10302 1060.5429
400 -219.78434 0.035056028 -219.74929 67.801309 113.88001 1060.5429
500 -219.78313 0.031576851 -219.75156 61.072287 580.6186 1060.5429
600 -219.7756 0.021923343 -219.75368 42.40159 812.3297 1060.5429
700 -219.78108 0.023820451 -219.75726 46.070758 732.91804 1060.5429
800 -219.78308 0.021974712 -219.7611 42.500943 498.86378 1060.5429
900 -219.78223 0.017310548 -219.76492 33.480058 226.54755 1060.5429
1000 -219.7867 0.018039368 -219.76866 34.889656 -99.405242 1060.5429
1100 -219.78739 0.015950019 -219.77144 30.84868 -368.47626 1060.5429
1200 -219.78804 0.014223823 -219.77382 27.510071 -538.23678 1060.5429
1300 -219.78442 0.009168641 -219.77525 17.732923 -554.96154 1060.5429
1400 -219.78821 0.011720654 -219.77649 22.668731 -465.29996 1060.5429
1500 -219.78975 0.012687905 -219.77706 24.539476 -279.10681 1060.5429
1600 -219.78827 0.011200658 -219.77707 21.663015 -29.13483 1060.5429
1700 -219.78642 0.010006422 -219.77641 19.353263 243.6751 1060.5429
1800 -219.78579 0.010906011 -219.77488 21.093144 490.65047 1060.5429
1900 -219.78573 0.013373885 -219.77236 25.866219 624.00024 1060.5429
2000 -219.78225 0.014287484 -219.76796 27.633197 619.02662 1060.5429
2100 -219.78467 0.022521156 -219.76215 43.557812 442.21135 1060.5429
2200 -219.78438 0.028823738 -219.75555 55.747536 175.23192 1060.5429
2300 -219.78501 0.034010184 -219.751 65.778558 -188.90178 1060.5429
2400 -219.77587 0.028191148 -219.74768 54.524054 -557.11735 1060.5429
2500 -219.77369 0.027810336 -219.74588 53.78753 -781.51594 1060.5429
2600 -219.78217 0.036716367 -219.74545 71.012545 -762.48742 1060.5429
2700 -219.77923 0.034033695 -219.7452 65.82403 -522.27277 1060.5429
2800 -219.77753 0.031754385 -219.74577 61.415653 -232.12544 1060.5429
2900 -219.77919 0.032601519 -219.74659 63.054081 61.069617 1060.5429
3000 -219.78067 0.033007232 -219.74766 63.838765 391.91055 1060.5429
3100 -219.77437 0.025678312 -219.7487 49.664016 701.85643 1060.5429
3200 -219.77692 0.026349764 -219.75057 50.962661 844.2123 1060.5429
3300 -219.78329 0.030588548 -219.7527 59.160827 662.97194 1060.5429
3400 -219.78856 0.033237528 -219.75532 64.284177 293.15421 1060.5429
3500 -219.78095 0.02378893 -219.75716 46.009793 -83.926909 1060.5429
3600 -219.7769 0.017962176 -219.75893 34.740361 -406.01025 1060.5429
3700 -219.77813 0.017584279 -219.76054 34.009475 -627.44542 1060.5429
3800 -219.77929 0.017350478 -219.76194 33.557284 -715.87346 1060.5429
3900 -219.78244 0.019080186 -219.76336 36.902686 -603.04337 1060.5429
4000 -219.78846 0.023776304 -219.76468 45.985373 -327.24592 1060.5429
4100 -219.792 0.026501155 -219.7655 51.255464 17.488112 1060.5429
4200 -219.7892 0.023318115 -219.76589 45.099196 308.73311 1060.5429
4300 -219.77882 0.01343071 -219.76539 25.976124 550.0542 1060.5429
4400 -219.77641 0.011723519 -219.76468 22.674272 690.40336 1060.5429
4500 -219.77881 0.015309962 -219.7635 29.610755 639.09389 1060.5429
4600 -219.78066 0.019182412 -219.76147 37.1004 395.64924 1060.5429
4700 -219.78544 0.027123835 -219.75832 52.459779 24.35561 1060.5429
4800 -219.78352 0.030550771 -219.75297 59.087763 -347.77022 1060.5429
4900 -219.78209 0.035013865 -219.74708 67.719761 -675.27941 1060.5429
5000 -219.76129 0.021421171 -219.73987 41.430347 -841.05433 1060.5429
Loop time of 18.2714 on 1 procs for 5000 steps with 5 atoms
Performance: 23.643 ns/day, 1.015 hours/ns, 273.651 timesteps/s
170.9% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 18.203 | 18.203 | 18.203 | 0.0 | 99.63
Neigh | 0.0076709 | 0.0076709 | 0.0076709 | 0.0 | 0.04
Comm | 0.017699 | 0.017699 | 0.017699 | 0.0 | 0.10
Output | 0.0048833 | 0.0048833 | 0.0048833 | 0.0 | 0.03
Modify | 0.029036 | 0.029036 | 0.029036 | 0.0 | 0.16
Other | | 0.00877 | | | 0.05
Nlocal: 5 ave 5 max 5 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 130 ave 130 max 130 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 0 ave 0 max 0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
FullNghs: 20 ave 20 max 20 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 20
Ave neighs/atom = 4
Neighbor list builds = 500
Dangerous builds not checked
Total wall time: 0:00:19
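The thermodynamic table in the LAMMPS output above can be post-processed with a few lines of Python. The sketch below is a minimal parser, assuming the log text keeps its original line breaks (for example as stored in the log.lammps file that LAMMPS writes by default); the function name parse_thermo is hypothetical. The sample embeds only three rows copied from the run above, so the printed values describe that excerpt rather than the whole trajectory.

```python
def parse_thermo(log_text):
    """Collect thermo rows as dicts keyed by the column names of the header line."""
    rows = []
    header = None
    for line in log_text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "Step":
            header = tokens  # start of a thermo block
            continue
        if header is not None and tokens[0] == "Loop":
            header = None  # "Loop time of ..." marks the end of the block
            continue
        if header is not None and len(tokens) == len(header):
            try:
                values = [float(tok) for tok in tokens]
            except ValueError:
                continue  # skip lines that are not purely numeric
            rows.append(dict(zip(header, values)))
    return rows


# Three rows copied from the run above (steps 0, 100, and 5000).
sample_log = """\
Step PotEng KinEng TotEng Temp Press Volume
0 -219.77031 0.025852029 -219.74446 50 -822.49232 1060.5429
100 -219.76464 0.020095149 -219.74455 38.865709 -779.63202 1060.5429
5000 -219.76129 0.021421171 -219.73987 41.430347 -841.05433 1060.5429
Loop time of 18.2714 on 1 procs for 5000 steps with 5 atoms
"""

rows = parse_thermo(sample_log)
energy_change = rows[-1]["TotEng"] - rows[0]["TotEng"]
print(f"parsed {len(rows)} rows")
print(f"TotEng change from first to last row: {energy_change:.5f} eV")
```

With the full table parsed this way, columns such as Temp or TotEng can be plotted against Step to inspect how the simulation evolves.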
This is a "Deep Potential Molecular Dynamics" (DeePMD-kit) quick start guide, which allows you to quickly understand the paradigm cycle of DeePMD-kit and apply it to your project.
Reference
Han Wang, Linfeng Zhang, Jiequn Han, and Weinan E. DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics. Comput. Phys. Comm., 228:178–184, 2018. doi:10.1016/j.cpc.2018.03.016.
Jinzhe Zeng, Duo Zhang, Denghui Lu, Pinghui Mo, Zeyu Li, Yixiao Chen, Marián Rynik, Li'ang Huang, Ziyao Li, Shaochen Shi, Yingze Wang, Haotian Ye, Ping Tuo, Jiabin Yang, Ye Ding, Yifan Li, Davide Tisi, Qiyu Zeng, Han Bao, Yu Xia, Jiameng Huang, Koki Muraoka, Yibo Wang, Junhan Chang, Fengbo Yuan, Sigbjørn Løland Bore, Chun Cai, Yinnian Lin, Bo Wang, Jiayan Xu, Jia-Xin Zhu, Chenxing Luo, Yuzhi Zhang, Rhys E. A. Goodall, Wenshuo Liang, Anurag Kumar Singh, Sikai Yao, Jingchao Zhang, Renata Wentzcovitch, Jiequn Han, Jie Liu, Weile Jia, Darrin M. York, Weinan E, Roberto Car, Linfeng Zhang, and Han Wang. DeePMD-kit v2: A software package for Deep Potential models. 2023. doi:10.48550/arXiv.2304.09409.
https://docs.deepmodeling.com/projects/deepmd/en/master/index.html