空间站广场

论文

Notebooks

比赛

课程

Apps

我的主页

我的Notebooks

我的论文库

我的足迹

我的工作空间

任务

节点

文件

数据集

镜像

项目

数据库

公开

Effects of NVT/NPT Simulation Settings on Diffusion Coefficient Calculation

English

分子动力学

OpenMM

动力学

English分子动力学OpenMM动力学

Yanze Wang

发布于 2023-10-31

推荐镜像 :Third-party software:dmff:0.2.0-notebook

推荐机型 :c2_m4_cpu

数据集

Water models(v1)

This notebook was adopted from an independent course final projects (MIT 10.437/10.637/5.697/5.698 Computational Chemistry 2023).
In theory, only NVE simulations can guarantee a strict diffusion coefficient calculation, while this notebook seeks to know how NVT/NPT influence the calcultion of diffusion coefficients.
You can modify the code below to accomplish NVE simulations by OpenMM
For motivation and theory part, one may refer to : 拿分子动力学算个扩散系数，怎么这么多讲究？
如果你想阅读中文，请点击右侧的翻译按钮翻译成中文。

代码

文本

1. Introduction

Diffusion coefficient is an import transport property. One can easily obtain the diffusion coefficient from the Green-Kuo equation as1:

$D = \frac{1}{3 N} i = 1 \sum N \int_{0}^{\infty} ⟨ v_{i} (0) v_{i} (t) ⟩ d t$

where $N$ is the number of particles, $v_{i}$ is the velocity of the $i$ ’th particle. However, the autocorrelation function of particle’s velocity converges very slowly and would be affected largely by the finite size effect. A more practical expression is the Einstein relation:

$D = \frac{1}{6} i = 1 \sum N t \to \infty lim \frac{d}{d t} ⟨ ∣ r_{i} (t) - r_{i} (0) ∣^{2} ⟩ = \frac{1}{6} t \to \infty lim \frac{MS D}{t}$

and the mean square displacement (MSD) is defined as: $MS D (t) = ⟨ ∣ r_{i} (t) - r_{i} (0) ∣^{2} ⟩$ In practice, the Einstein formula gains better numerical stability and becomes more popular. Nevertheless, calculating diffusion coefficient is not a trivial problem. Many parameters and setting in molecular dynamics simulation may influence the final results and lead to unpleasant artifacts. In this project, we will go through several aspects those can affect the diffusion coefficient calculations, including the choice of ensemble, finite size effects (of cubic boxes) and the thermostat effects. Careful tuning simulation parameters is critical to satisfying results and this project provides practical guidance about how to effectively setup molecular dynamics simulations for transport properties.

代码

文本

2. Theory

Finite size effect on diffusion coefficient calculation

MD programs can only simulate finite systems in practice due to the finite computational resources. To mimic the bulk property in the real world and eliminate the surface effect, the periodical boundary condition (PBC) is widely used, where the copies (or images) of the simulated box is periodically placed together forming an infinite system. However, due to the long-range nature of electrostatic interactions, the thermodynamic and transport properties may suffer from the PBC artifacts as the molecules can actually see their images in neighboring cells². The artifacts become severe especially when the box size approaching to the nonbonded interaction cutoff when using Particle Mesh Ewald (PME)³ methods for electrostatic interactions.

Dünweg and Kremer⁴ firstly found the computed diffusion coefficient is proportional to N^(1/3). Yeh and Hummer⁵ finally derived an analytical relation between computed diffusion coefficients and the inverse box size, known as the Yeh-Hummer (YH) relation: $D_{se l f}^{\infty} = D_{se l f}^{M D} + (k_{B} T ξ) /6 π η L$

where $D_{se l f}^{M D}$ is the computed self-diffusion coefficient at a given box size; $D_{se l f}^{\infty}$ is the corrected diffusion coefficient for an infinite box; $k_{B}$ is the Boltzmann constant; $T$ is the temperature; $ξ$ is a dimensionless constant equal to $2.837$ for cubic lattices (This value was measured under NVE ensemble); $η$ is the shear viscosity computed from the simulation; $L$ is the box length of cubic boxes. From the YH relation, we would expect the computed diffusion coefficient would generally become larger in a larger box and the extrapolated $D_{se l f}^{\infty}$ gives the greatest value. We also would expect that when the box length L is large enough, the computed diffusion coefficient $D_{se l f}^{M D}$ may converge slowly and be close to $D_{se l f}^{\infty}$ .

However, YH relation was proposed at a time when only a few thousands of water molecules were affordable at large. As we have faster and better computational resources, this effect may be less significant. In the Result Section, we will examine the YH relation for a wide range of numbers of water molecules, and numerically show how $D_{se l f}^{M D}$ change when the box is sufficiently large.

代码

文本

Effect of the Barostat

Different thermodynamic ensembles have different simulation techniques. A barostat is exerted on the system to maintain a constant pressure under NPT ensemble simulations. However, barostats usually control pressure by scaling the coordinates of particles globally. The scaling technique can drastically affect the calculation of MSD and thus affect diffusion coefficient results. Monte Carlo Barostat⁶ is widely used for NPT ensemble MD simulations, which uses MD/MC scheme to sample volume fluctuations and control system pressure. For a NPT ensemble, the thermodynamic probability is expressed as: $P (p, q, V) \propto e^{(- (K + U + p_{0} V) / (k_{B} T))}$

where $K$ is the kinetic energy; $U$ is the potential energy; $p_{0}$ is the external pressure and $V$ is the volume. After each regular MD step, a trial MC step $Δ V_{r an d o m}$ is proposed to change the system volume by scaling the coordinates as:

$l_{i}^{'} = l_{i} (\frac{V _{n e w}}{V _{o l d}})^{1/3}$

$r_{i}^{'} = (r_{i} - c_{i}) (\frac{V _{n e w}}{V _{o l d}})^{1/3} + c_{i}$

$l$ is the box length, $r_{i}$ is the coordinates and $c_{i}$ denotes the coordinates of the center of the periodic box. The Metropolis weight is then calculated by:

$Δ W = Δ U + p_{0} Δ V - N β^{- 1} ln \frac{V _{n e w}}{V _{o l d}}$

where $p_{0}$ is the desired pressure, $β = 1/ (k_{B} T)$ . The trial move will be conditionally accepted with probability $e^{Δ W}$ if $Δ W > 0$ and always accepted otherwise.

When scaling the coordinates by $r_{i}^{'} = (r_{i} - c_{i}) (V_{n e w} / V_{o l d})^{\frac{1}{3}} + c_{i}$ , the MSD would have an error approximately to $(1 - (V_{n e w} / V_{o l d})^{\frac{1}{3}}) (c_{i} - r_{i})$ . Therefore, we would expect the results from NPT ensemble are significantly different from the NVT/NVE ensemble. When the box size becomes larger, $(1 - (V_{n e w} / V_{o l d})^{1/3})$ becomes larger too given a fixed $Δ V$ . So, we also would expect that finite size effect becomes significant in NPT simulations.

代码

文本

Effect of the thermostats

Despite most papers have reported their results under NVT ensembles, the parameters of thermostats they use can also significantly influence the diffusion coefficient results. In NVT ensemble, the temperature is calculated as:

$T = \frac{2 K}{3 N k _{B}} = \frac{1}{3 N k _{B}} i \sum m v_{i}^{2}$

Essentially, maintaining temperature means controlling the kinetic energy of particles. Thermostats are a bunch of algorithms helping NVT ensemble simulations be within reasonable temperature fluctuations by tuning the velocities of particles. They are effectively correlating the system to a heat bath. Two main categories exist in thermostats: the stochastic algorithm, e.g., the Andersen thermostat , the family of Langevin Thermostats⁸ , etc.; and the scaling-based algorithm, e.g., the Nosé-Hoover (Chain) Thermostat^9,10 , the V-rescaling thermostat¹¹ , etc.

The stochastic algorithms use stochastic collisions on particles to mimic the heat conductivity. In the Andersen thermostat scheme, during each regular MD step, a set of particles are selected whose velocities will be resampled from the velocity Boltzmann distribution. The selection probability is tuned by the parameter called “collision frequency”. When the velocities of particles are changed/resampled, the history of kinetic information is totally discarded. As expected, when the collision frequency is high, the computed diffusion coefficient is low. In a limiting situation, if all particles get collision at every step, no kinetic information is kept and the results will be similar to that from a random walk.

For the theory of thermostats in detail, you may refer to the following notebooks:

代码

文本

Effect of water constraints

Hydrodynamics is important for property calculation. However, covalent hydrogen bonds typically have very high vibrational frequency, suffering from high frequency quantum effect and numerical instability. To overcome this problem, constraint techniques were proposed. The bond length and bond angle are constrained at fixed values based on algorithms during simulations. Generally used constraint algorithms include SHAKE^12,13 , SETTLE¹⁴ , RATTLE¹⁵ etc.

The flexibility of water molecules can largely influence the diffusion coefficient calculation. Some papers¹⁶ reported that diffusion coefficients were very sensitive to the bond length of water molecules. However, due to the high frequency nature of the covalent hydrogen bond, one has to use smaller time step (at least no greater than 1 fs) for numerical stability. This leads to significant longer simulation time in real situations.

代码

文本

Simulations

In this section, we will use OpenMM 8.0 to perform numerical experiments. GPU computational nodes are recommended.

A code template is provided below.

代码

文本

In this project, we choose TIP3P water model as the example model, parameterized by Li et al in 2015 ¹⁷ . This water model is currently the standard water model in AmberTools18 tleap program.

The well-equilibrated water models are provided under /bohr/WaterTip3pBoxes-fmf8/v1/models

All simulations are performed via OpenMM 8.0¹⁹ . Temperature was set to 300 K. The periodical boundary condition was applied for all simulations. Nonbonded interactions are calculated under the Particle Mesh Ewald (PME) scheme, where the nonbonded cutoff is set to 1 nanometer.

代码

文本

[10]

ls /bohr/WaterTip3pBoxes-fmf8/v1/models

tip3p_10.0.prmtop*               tip3p_17.prmtop*
tip3p_10.0_equilibrated.inpcrd*  tip3p_17_equilibrated.inpcrd*
tip3p_11.0.prmtop*               tip3p_18.prmtop*
tip3p_11.0_equilibrated.inpcrd*  tip3p_18_equilibrated.inpcrd*
tip3p_12.0.prmtop*               tip3p_19.prmtop*
tip3p_12.0_equilibrated.inpcrd*  tip3p_19_equilibrated.inpcrd*
tip3p_13.0.prmtop*               tip3p_20.prmtop*
tip3p_13.0_equilibrated.inpcrd*  tip3p_20_equilibrated.inpcrd*
tip3p_14.0.prmtop*               tip3p_25.prmtop*
tip3p_14.0_equilibrated.inpcrd*  tip3p_25_equilibrated.inpcrd*
tip3p_15.0.prmtop*               tip3p_30.prmtop*
tip3p_15.0_equilibrated.inpcrd*  tip3p_30_equilibrated.inpcrd*
tip3p_16.prmtop*                 tip3p_35.prmtop*
tip3p_16_equilibrated.inpcrd*    tip3p_35_equilibrated.inpcrd*

代码

文本

[ ]

import openmm as mm

import openmm.unit as unit

import openmm.app as app

import parmed as pmd

import numpy as np

from pathlib import Path

import os

# print openmm version

print(f"OpenMM version: {mm.__version__}")

代码

文本

[ ]

def simulate_water_box(

prmtop_file,

inpcrd_file,

ensemble='NPT',

n_steps=1000000,

report_invertal=1000,

replicate=1,

integrator_name='LangevinIntegrator',

constraint_type=None,

hydrogenmass=None,

time_interval=2,

collision_rate=1.0,

temperature = 300 * unit.kelvin,

save_finalFrame=False

)：

# Load the Amber-formatted water box

inpcrd = app.AmberInpcrdFile(inpcrd_file)

prmtop = app.AmberPrmtopFile(prmtop_file)

# Set constraints type

if constraint_type == 'HBonds':

constraints = app.HBonds

rigidWater = False

elif constraint_type == 'HAngles':

constraints = app.HAngles

rigidWater = False

elif constraint_type == None or constraint_type == "None":

constraints = None

rigidWater = False

elif constraint_type == "Full":

rigidWater = True

constraints = app.HBonds

else:

raise ValueError(f"Constraint type {constraint_type} is not supported")

system = prmtop.createSystem(nonbondedMethod=app.PME,

nonbondedCutoff=1.0*unit.nanometers,

constraints=constraints,

rigidWater=rigidWater)

# Set up integrator and simulation context

_collision_rate = 1.0 ## for equilibration

time_interval = time_interval * unit.femtoseconds

if integrator_name == 'LangevinIntegrator':

integrator = mm.LangevinIntegrator(300 * unit.kelvin, _collision_rate / unit.picosecond, time_interval)

elif integrator_name == "LangvinMiddleIntegrator":

integrator = mm.LangevinMiddleIntegrator(300 * unit.kelvin, _collision_rate / unit.picosecond, time_interval)

elif integrator_name == 'VerletIntegrator' or "AndersenThermostat" in integrator_name:

system.addForce(mm.AndersenThermostat(300*unit.kelvin, _collision_rate/unit.picosecond))

integrator = mm.VerletIntegrator(time_interval)

elif integrator_name == 'NoseHooverIntegrator':

integrator = mm.NoseHooverIntegrator(300*unit.kelvin, _collision_rate/unit.picosecond, time_interval)

elif integrator_name == None:

integrator = mm.LangevinIntegrator(300 * unit.kelvin, _collision_rate / unit.picosecond, time_interval)

else:

raise ValueError(f"Integrator {integrator_name} is not supported")

platform = mm.Platform.getPlatformByName('CUDA')

barostat = mm.MonteCarloBarostat(1.0 * unit.bar, temperature)

system.addForce(barostat)

simulation = app.Simulation(prmtop.topology, system, integrator, platform)

# Set the initial positions

simulation.context.setPositions(inpcrd.positions)

# Minimize energy

print("Minimizing energy...")

simulation.minimizeEnergy()

if not (integrator_name == 'VerletIntegrator' or "AndersenThermostat" in integrator_name):

integrator.setTemperature(temperature)

simulation.step(20000)

force_dict = {force.__class__.__name__: index for index, force in enumerate(system.getForces())}

# change collision rate

if integrator_name == 'LangevinIntegrator':

integrator.setFriction(collision_rate / unit.picosecond)

elif integrator_name == "LangvinMiddleIntegrator":

integrator.setFriction(collision_rate / unit.picosecond)

elif integrator_name == 'VerletIntegrator' or "AndersenThermostat" in integrator_name:

system.getForce(force_dict['AndersenThermostat']).setDefaultCollisionFrequency(collision_rate / unit.picosecond)

elif integrator_name == "NoseHooverIntegrator":

pass

elif integrator_name == None:

system.removeForce(force_dict['AndersenThermostat'])

else:

raise ValueError(f"Integrator {integrator_name} is not supported")

name_prefix = f"{inpcrd_file.stem}_{ensemble}_{n_steps}_{replicate}_{integrator_name}_{constraint_type}_{collision_rate}"

# remove the barostat, get the barastat force index

if ensemble == 'NVT' or ensemble == 'NVE':

try:

system.getForce(force_dict['MonteCarloBarostat']).setFrequency(0) # This requires openmm 8

except KeyError:

# No MonteCarloBarostat found

print("No MonteCarloBarostat found in forces. Continuing.")

except mm.OpenMMException:

# Must be Openmm<8

system.removeForce(force_dict['MonteCarloBarostat'])

print("NVT Equilibration...")

if ensemble == 'NVE':

try:

system.removeForce(force_dict['AndersenThermostat'])

except KeyError:

integrator = mm.VerletIntegrator(stepSize=time_interval)

elif ensemble == 'NPT' or ensemble is None:

pass

else:

raise ValueError(f"Invalid parameter supplied for {ensemble}")

simulation.context.reinitialize(preserveState=True)

simulation.reporters = []

simulation.reporters.append(app.StateDataReporter(f'{name_prefix}.log', report_invertal, step=True, temperature=True, density=True, volume=True))

simulation.reporters.append(pmd.openmm.NetCDFReporter(f'{name_prefix}.nc', report_invertal))

simulation.step(n_steps)

if save_finalFrame:

# Save the final structure as Amber inpcrd file

final_positions = simulation.context.getState(getPositions=True).getPositions()

final_box_vectors = simulation.context.getState().getPeriodicBoxVectors()

final_structure = pmd.openmm.load_topology(prmtop.topology, xyz=final_positions)

final_structure.box_vectors = final_box_vectors

# Save the final structure to an Amber inpcrd file

final_structure.save(f"{name_prefix}.inpcrd", overwrite=True)

代码

文本

Code template for simulations, please modify as needed

Code templates for different tests are provided below. They may cost a significant amount of time to be finished. We recommend you using computational clusters or submit jobs to Bohrium.

代码

文本

[ ]

# Example: Test the effect of collision rate:

# !!Modify them to fit your test!!

for replicate in range(3):

for integrator in ["LangevinIntegrator", "AndersenThermostat"]:

for collision_rate in [0.1, 1, 5, 10, 50]:

simulate_water_box(

Path(prmtop_file),

Path(inpcrd_file),

ensemble="NVT",

n_steps=2500000,

report_invertal=500,

replicate=replicate,

integrator_name=integrator,

constraint_type="Full",

time_interval=2.0,

collision_rate=collision_rate,

)

代码

文本

Code template for analysis, please modify as needed

After simulations, use Ambertools to get the diffusion coefficients from the trajectories.

A code template is provided below.

代码

文本

[ ]

## This is only for example, Do not run it.

cpp_in_tmpl = """

parm {parm}

trajin {traj} 100 last

diffusion out {msdout} time 1 :WAT@O average diffout {diffout}

"""

model_dir = Path('./tip3p_models')

script_dir = Path("./script")

result_dir = Path("./result")

if not os.path.exists(script_dir):

os.mkdir(script_dir)

replicates = 3

cpp_in_files = []

trajs = list(result_dir.glob('*.nc'))

for traj in trajs:

parm = model_dir / f"tip3p_{traj.stem.split('_')[2]}.prmtop"

cpp_in = script_dir / f"cpp_{traj.stem}.in"

msdout = result_dir / f"msd_{traj.stem}.dat"

diffout = result_dir / f"diff_{traj.stem}.dat"

with open(cpp_in, "w") as f:

f.write(cpp_in_tmpl.format(parm=parm, traj=traj, msdout=msdout, diffout=diffout))

cpp_in_files.append(cpp_in)

def run_cpptraj(cpp_in):

os.system(f"cd {script_dir} && cpptraj -i {cpp_in}")

if __name__ == "__main__":

n_proc = 4

if multiprocess:

# may not work in JupyterNotebook.

import multiprocessing as mp

with mp.Pool(n_proc) as pool:

pool.map(run_cpptraj, cpp_in_files)

else:

for ff in cpp_in_files:

run_cpptraj(ff)

代码

文本

Code template for visualization

代码

文本

[ ]

##############################################################################

# Code Template, Do not run!

##############################################################################

# Your Code goes here to collection results.

##############################################################################

all_diff_data_replicates = {}

for replicate in range(replicates):

result_dir = Path(f"./result")

diff_datas = result_dir.glob('diff_*.dat')

all_diff_data = {}

for diff in diff_datas:

diff_data = np.loadtxt(diff, skiprows=1, usecols=(0, 1))

ensemble = diff.stem.split('_')[1]

box_size = diff.stem.split('_')[3]

if ensemble == "NVT":

steps = diff.stem.split('_')[-4]

elif ensemble == "NPT":

steps = diff.stem.split('_')[-2]

else:

raise ValueError("Unknown ensemble")

if not ensemble in all_diff_data:

all_diff_data[ensemble] = {}

if not box_size in all_diff_data[ensemble]:

all_diff_data[ensemble][box_size] = {}

all_diff_data[ensemble][box_size][steps] = diff_data

all_diff_data_replicates[replicate] = all_diff_data

for steps in all_diff_data_replicates[0]['NVT']['10.0']:

plt.figure()

lines = {}

YH_relation = []

for ensemble in ["NVT"]:

for box_size in all_diff_data_replicates[0][ensemble]:

if ensemble == "NVT":

marker = "o"

color = "b"

elif ensemble == "NPT":

marker = "s"

color = "r"

diffusion_coefficient = []

for replicate in range(replicates):

print(replicate,ensemble,box_size,steps)

diffusion_coefficient.append(all_diff_data_replicates[replicate][ensemble][box_size][steps][0][1])

N_water = water_number[box_size]

real_box = real_box_sizes[box_size][ensemble]

X = 1 / real_box

eb = plt.errorbar(X,

np.mean(diffusion_coefficient),

yerr=np.std(diffusion_coefficient),

fmt=marker, ecolor=color, color=color, capsize=3, capthick=1, elinewidth=1,)

plt.scatter(

[X, X, X], diffusion_coefficient,

marker=marker, color="black", s=10, alpha=0.5

)

lines[ensemble] = eb

YH_relation.append([X, np.mean(diffusion_coefficient)])

plt.xlabel("1 / box length ($nm^{-1}$)")

plt.ylabel("Diffusion Coefficient ($ 10^{-5} cm^{2}/s $)")

plt.title(f"Diffusion Coefficient v.s. inverse Box Size")

# linear fitting

YH_relation = np.array(YH_relation)

res = stats.linregress(YH_relation[:,0], YH_relation[:,1])

print(YH_relation)

print(f"R-squared: {res.rvalue**2:.6f}")

print(f"slope: {res.slope:.6f}")

print(f"intercept: {res.intercept:.6f}")

new_X = np.linspace(0, np.max(YH_relation[:,0]), 100)

plt.plot(new_X, res.intercept + res.slope*new_X, color="red", linestyle="--", alpha=0.6)

plt.scatter(0, res.intercept, marker="x", color="red", s=20, alpha=0.6)

plt.text(0, res.intercept, f"{res.intercept:.4f}", color="red")

plt.savefig(pics_dir / f"diffusion_{steps}.png")

print(pics_dir / f"diffusion_{steps}.png", "saved")

plt.close()

print(water_number)

代码

文本

[1]

from IPython.display import Image

代码

文本

Results

Finite size effects are well captured by Yeh-Hummer relation

To obtain the results of different box sizes, we performed MD simulations of water boxes with different sizes. Since the nonbonded cutoff was set to 1 nanometer, the smallest built initial box length has to be greater than 2 nanometers, otherwise the periodic images will lead to great artifacts.

From the simulation results (shown in Figure 1), we observed that the computed diffusion coefficients become larger at larger box sizes in general. The results basically obey the YH relation with $R^{2}$ coefficients equal to 0.391.

Another observation is that with the increase of the box size, the variance of the computed diffusion coefficients becomes smaller. Simulating larger water boxes is beneficial to get statistically confidential results. It’s worth noting that in the previous work of Yeh and Hummer, they simulated water boxes containing 128, 256, 512, 1024 and 2048 respectively. The number of water molecules in our systems ranges from 371 to 12626, which is significantly broader than that in Yeh’ work. However, even with as many as 12626 water molecules, the calculated diffusion coefficient still has the general trend. The extrapolated $D_{se l f}^{\infty} = 4.98 \times 1 0^{- 5} c m^{2} /s$ , owning the greatest values as well as the largest error compared to the experimental results $2.299 \times 1 0^{- 5} c m^{2} /s$ . However, this values practically eliminate the effect of finite size and therefore reflect the performance of force fields better.

代码

文本

[6]

Image("/bohr/WaterTip3pBoxes-fmf8/v1/diffusion_ensemble_2500000.png", width=600, height=50)

代码

文本

Diffusion coefficients from NPT ensembles are different from NVT simulations

In general, NPT simulation (red) showed similar results to NVT simulations (blue) and obeyed Yeh-Hummer relation5 (shown in Figure 1) with R2 equals to 0.502. The extrapolated $D_{NPT}^{\infty}$ is $5.02 \times 1 0^{- 5} c m^{2} /s$ , larger than $D_{N V T}^{\infty}$ $4.98 \times 1 0^{- 5} c m^{2} /s$ . The fitted slope of NPT results (-0.606) is also larger than that in NVT results (-0.436), indicating NPT simulation was more sensitive to the cubic box size. This was expected because the barostats in NPT simulation scales the particle coordinates globally as we discussed in the Theory Section, although the difference between the two ensembles is not very significant under current simulation setting. Nevertheless, we still recommend not to use NPT ensemble simulations to compute the diffusion coefficient because of those artifacts.

代码

文本

Nosé-Hoover Thermostat significantly increases diffusion coefficients than stochastic schemes

Nosé-Hoover thermostats showed significantly larger diffusion coefficients than all stochastic thermostats (shown in Figure 2). The scaling-based algorithm keeps most history of the kinetic information and makes trajectory more continuous. So, the diffusion coefficient becomes larger given the same interaction interval ( $1 ps^{- 1}$ ) with the heat bath.

Among the stochastic thermostats, Andersen thermostats had the largest diffusion coefficient when the collision frequency was set to $1 ps^{- 1}$ . A possible reason is that, in Andersen scheme, the particles not selected were still keeping NVE trajectories, while in Langevin scheme, all particles were experiencing dissipation and collision and therefore likely lost more kinetic information. From this angle, the NVE ensemble is a better candidate for transport property calculations in principle because no thermostat or barostat disturb the trajectories. From the statistical mechanic theory, when the number of particles is large enough, NVE ensemble, NVT ensemble and NPT ensemble would resemble with each other.

In summary, all stochastic thermostat caused the collapse of the kinetic information. NVE simulation may be desired when system is large enough.

代码

文本

[7]

Image("/bohr/WaterTip3pBoxes-fmf8/v1/diffusion_thermostat.png", width=600, height=50)

代码

文本

Higher collision frequency causes more randomness in simulations

To better understand how the stochastic thermostats affect the calculation of transport properties, we further studied the results calculated from different collision frequencies (shown in Figure 3) using Andersen thermostats and Langevin thermostats. As expected, when the collision frequency becomes larger, the diffusion coefficients decrease due to the loss of kinetic information. When the collision frequency is very high ( $50 ps^{- 1}$ ), the results from the two stochastic algorithms are approaching to each other as they resemble Brownian dynamics gradually.

Although low collision frequency largely keeps kinetic information, it can’t maintain the temperature well with the largest standard deviation on diffusion calculations ( $0.5 \times 1 0^{5} c m^{2} /s$ ). We here recommend $1 ps^{- 1}$ collision frequency for stochastic thermostats. This value has the good capability to maintain temperature as well as gives results close to those in low collision frequency (rel. error 6%).

It is worth noting that the decreased diffusion coefficients by increase collision rate doesn’t mean the reduced error to experimental value $2.299 \times 1 0^{- 5} c m^{2} /s$ . It is just the artifact from the nature of randomness.

代码

文本

[8]

Image("/bohr/WaterTip3pBoxes-fmf8/v1/diffusion_collision.png", width=600, height=50)

代码

文本

Flexible water models are more accurate

No significant difference was found among the constrained water models (shown in Figure 4). However, the flexible water model had a significantly lower diffusion coefficient value and much closer to the experimental results. The flexibility made diffusion of water molecules slower than rigid models. A possible reason is that the flexibility changed the collision behavior between molecules. Flexible water models also have larger both shear and bulk viscosity. The interactions between flexible water molecules are larger than those in rigid models. The water structure and hydrogen bond network also changed because of the flexibility. All these factors slowed down the motion of water and caused a lower diffusion coefficient.

In practical speaking, flexible water models need smaller time step ( $1 fs$ ). The wall time for flexible water model is twice as the rigid water models. Therefore, the choice of constraints is the trade-off between accuracy and speed. When computational resources are enough, not using constraints is recommended.

However, one should note that generally flexible water models have different force field parameters from the rigid models. They are usually separately parameterized. In this article, we only released the constraints while still keeping the TIP3P parameters just to show the effect solely from constraints. In practice, TIP3P is widely used as a rigid water model. If a flexible water model is needed, one may refer to SPC/FW¹⁶, TIP3P/FW¹⁶, etc. to gain higher accuracy.

代码

文本

[9]

Image("/bohr/WaterTip3pBoxes-fmf8/v1/diffusion_constriants.png", width=600, height=50)

代码

文本

Summary

In this study, we examined several molecular dynamics simulation settings those can affect the calculations of diffusion coefficients. We provided the recommended simulation guidance as follows. One should calculate diffusion coefficients under NVE ensemble in principle with sufficiently large simulation box. When NVT ensemble is necessary, one should choose thermostat carefully with recommended $1 p s^{- 1}$ collision frequency. Nosé-Hoover thermostat can get significantly larger computed diffusion coefficient. Avoiding NPT ensemble is necessary. When box size is limited, one can use YH relation extrapolation to get approximated diffusion coefficient values of infinite box size. Water constraints can fasten water diffusion and result in much larger diffusion coefficients. Flexible waters are highly recommended if accuracy matters in the questions of interest.

代码

文本

Reference

(1) Wang, J.; Hou, T. Application of Molecular Dynamics Simulations in Molecular Property Prediction II: Diffusion Coefficient. J. Comput. Chem. 2011, 32 (16), 3505–3519. https://doi.org/10.1002/jcc.21939.

(2) Beglov, D.; Roux, B. Finite Representation of an Infinite Bulk System: Solvent Boundary Potential for Computer Simulations. J. Chem. Phys. 1994, 100 (12), 9050–9063. https://doi.org/10.1063/1.466711.

(3) Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An N ⋅log( N ) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98 (12), 10089–10092. https://doi.org/10.1063/1.464397.

(4) Dünweg, B.; Kremer, K. Microscopic Verification of Dynamic Scaling in Dilute Polymer Solutions: A Molecular-Dynamics Simulation. Phys. Rev. Lett. 1991, 66 (23), 2996–2999. https://doi.org/10.1103/PhysRevLett.66.2996.

(5) Yeh, I.-C.; Hummer, G. System-Size Dependence of Diffusion Coefficients and Viscosities from Molecular Dynamics Simulations with Periodic Boundary Conditions. J. Phys. Chem. B 2004, 108 (40), 15873–15879. https://doi.org/10.1021/jp0477147.

(6) Åqvist, J.; Wennerström, P.; Nervall, M.; Bjelic, S.; Brandsdal, B. O. Molecular Dynamics Simulations of Water and Biomolecules with a Monte Carlo Constant Pressure Algorithm. Chem. Phys. Lett. 2004, 384 (4–6), 288–294. https://doi.org/10.1016/j.cplett.2003.12.039.

(7) Andersen, H. C. Molecular Dynamics Simulations at Constant Pressure and/or Temperature. J. Chem. Phys. 1980, 72 (4), 2384–2393. https://doi.org/10.1063/1.439486.

(8) Zhang, Z.; Liu, X.; Yan, K.; Tuckerman, M. E.; Liu, J. Unified Efficient Thermostat Scheme for the Canonical Ensemble with Holonomic or Isokinetic Constraints via Molecular Dynamics. J. Phys. Chem. A 2019, 123 (28), 6056–6079. https://doi.org/10.1021/acs.jpca.9b02771.

(9) Martyna, G. J.; Klein, M. L.; Tuckerman, M. Nosé–Hoover Chains: The Canonical Ensemble via Continuous Dynamics. J. Chem. Phys. 1992, 97 (4), 2635–2643. https://doi.org/10.1063/1.463940.

(10) Posch, H. A.; Hoover, W. G.; Vesely, F. J. Canonical Dynamics of the Nosé Oscillator: Stability, Order, and Chaos. Phys. Rev. A 1986, 33 (6), 4253–4265. https://doi.org/10.1103/PhysRevA.33.4253.

(11) Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J. Chem. Phys. 2007, 126 (1), 014101. https://doi.org/10.1063/1.2408420.

(12) Elber, R.; Ruymgaart, A. P.; Hess, B. SHAKE Parallelization. Eur. Phys. J. Spec. Top. 2011, 200 (1), 211–223. https://doi.org/10.1140/epjst/e2011-01525-9.

(13) Forester, T. R.; Smith, W. SHAKE, Rattle, and Roll: Efficient Constraint Algorithms for Linked Rigid Bodies. J. Comput. Chem. 1998, 19 (1), 102–111. https://doi.org/10.1002/(SICI)1096-987X(19980115)19:1<102::AID-JCC9>3.0.CO;2-T.

(14) Miyamoto, S.; Kollman, P. A. Settle: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13 (8), 952–962. https://doi.org/10.1002/jcc.540130805.

(15) Andersen, H. C. Rattle: A “Velocity” Version of the Shake Algorithm for Molecular Dynamics Calculations. J. Comput. Phys. 1983, 52 (1), 24–34. https://doi.org/10.1016/0021-9991(83)90014-1.

(16) Wu, Y.; Tepper, H. L.; Voth, G. A. Flexible Simple Point-Charge Water Model with Improved Liquid-State Properties. J. Chem. Phys. 2006, 124 (2), 024503. https://doi.org/10.1063/1.2136877.

(17) Li, P.; Song, L. F.; Merz, K. M. Systematic Parameterization of Monovalent Ions Employing the Nonbonded Model. J. Chem. Theory Comput. 2015, 11 (4), 1645–1657. https://doi.org/10.1021/ct500918t.

(18) Case, D. A.; Aktulga, H. M.; Belfon, K.; Cerutti, D. S.; Cisneros, G. A.; Cruzeiro, V. W. D.; Forouzesh, N.; Giese, T. J.; Götz, A. W.; Gohlke, H.; Izadi, S.; Kasavajhala, K.; Kaymak, M. C.; King, E.; Kurtzman, T.; Lee, T.-S.; Li, P.; Liu, J.; Luchko, T.; Luo, R.; Manathunga, M.; Machado, M. R.; Nguyen, H. M.; O’Hearn, K. A.; Onufriev, A. V.; Pan, F.; Pantano, S.; Qi, R.; Rahnamoun, A.; Risheh, A.; Schott-Verdugo, S.; Shajan, A.; Swails, J.; Wang, J.; Wei, H.; Wu, X.; Wu, Y.; Zhang, S.; Zhao, S.; Zhu, Q.; Cheatham, T. E.; Roe, D. R.; Roitberg, A.; Simmerling, C.; York, D. M.; Nagan, M. C.; Merz, K. M. AmberTools. J. Chem. Inf. Model. 2023, 63 (20), 6183–6191. https://doi.org/10.1021/acs.jcim.3c01153.

(19) Eastman, P.; Swails, J.; Chodera, J. D.; McGibbon, R. T.; Zhao, Y.; Beauchamp, K. A.; Wang, L.-P.; Simmonett, A. C.; Harrigan, M. P.; Stern, C. D.; Wiewiora, R. P.; Brooks, B. R.; Pande, V. S. OpenMM 7: Rapid Development of High Performance Algorithms for Molecular Dynamics. PLOS Comput. Biol. 2017, 13 (7), e1005659. https://doi.org/10.1371/journal.pcbi.1005659.

(20) Roe, D. R.; Cheatham, T. E. Parallelization of CPPTRAJ Enables Large Scale Analysis of Molecular Dynamics Trajectory Data. J. Comput. Chem. 2018, 39 (25), 2110–2117. https://doi.org/10.1002/jcc.25382.

(21) Roe, D. R.; Cheatham, T. E. PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data. J. Chem. Theory Comput. 2013, 9 (7), 3084–3095. https://doi.org/10.1021/ct400341p.

(22) Case, D. A.; Cheatham, T. E.; Darden, T.; Gohlke, H.; Luo, R.; Merz, K. M.; Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R. J. The Amber Biomolecular Simulation Programs. J. Comput. Chem. 2005, 26 (16), 1668–1688. https://doi.org/10.1002/jcc.20290.

(23) Reuther, A.; Kepner, J.; Byun, C.; Samsi, S.; Arcand, W.; Bestor, D.; Bergeron, B.; Gadepally, V.; Houle, M.; Hubbell, M.; Jones, M.; Klein, A.; Milechin, L.; Mullen, J.; Prout, A.; Rosa, A.; Yee, C.; Michaleas, P. Interactive Supercomputing on 40,000 Cores for Machine Learning and Data Analysis. In 2018 IEEE High Performance extreme Computing Conference (HPEC); IEEE: Waltham, MA, 2018; pp 1–6. https://doi.org/10.1109/HPEC.2018.8547629.

代码

文本

English

分子动力学

OpenMM

动力学

English分子动力学OpenMM动力学

已赞2

本文被以下合集收录

bohr61096f

更新于 2024-08-27

47 篇0 人关注

DPA2相关-自用

张星辰

更新于 2024-07-16

7 篇0 人关注