Tutorials on How to Build Non-Markovian Dynamic Models from Molecular Dynamics Simulations for Studying Protein Dynamics

Xuhui Huang, Yue Wu, Siqin Cao, Yunrui Qiu
DOI: https://doi.org/10.26434/chemrxiv-2023-kvsvl
2023-11-28
Abstract: Protein conformational changes play crucial roles in their biological functions. In recent years, the Markov State Model (MSM) constructed from extensive Molecular Dynamics (MD) simulations has emerged as a powerful tool for modeling complex protein conformational changes. In MSMs, dynamics are modeled as a sequence of Markovian transitions among metastable conformational states at discrete time intervals (called the lag time). A major challenge for MSMs is that the lag time must be long enough to allow transitions among states to become memoryless (or Markovian). However, this lag time is constrained by the length of the individual MD simulations available to track these transitions. To address this challenge, we have recently developed Generalized Master Equation (GME)-based approaches, encoding non-Markovian dynamics using a time-dependent memory kernel. In this tutorial, we introduce the theory behind two recently developed GME-based non-Markovian dynamic models: the Quasi Markov State Model (qMSM) and the Integrative Generalized Master Equation (IGME). We subsequently outline the procedures for constructing these models and provide a step-by-step tutorial on applying qMSM and IGME to study two peptide systems: the alanine dipeptide and villin headpiece. This tutorial is available at https://github.com/xuhuihuang/GME_tutorials. The protocols detailed in this tutorial aim to be accessible to non-experts interested in studying biomolecular dynamics using these non-Markovian dynamic models.
Chemistry
What problem does this paper attempt to address?
This paper addresses the time-scale problem in modeling protein dynamics. To describe complex protein conformational changes, a traditional Markov State Model (MSM) needs a lag time long enough for the transitions between states to become memoryless (i.e., Markovian). However, the lag time is bounded by the length of the individual molecular dynamics (MD) simulations available, so a sufficiently long lag time is often difficult to reach. To address this challenge, the authors developed approaches based on the Generalized Master Equation (GME) and introduce two non-Markovian dynamic models: the Quasi Markov State Model (qMSM) and the Integrative Generalized Master Equation (IGME). By encoding non-Markovian dynamics in a time-dependent memory kernel, these methods can accurately predict long-time-scale dynamics from shorter MD simulation data.

### Key points of the solution:

1. **Non-Markovian dynamics**: By introducing a time-dependent memory kernel, the GME method captures memory effects in state transitions and can therefore predict long-time-scale dynamics accurately at a shorter lag time.
2. **Reducing numerical instability**: The IGME method avoids the numerical instability caused by directly calculating the time-dependent memory kernel. Instead, it uses the analytical solution of the GME that holds once the memory kernel has completely decayed, which improves the robustness of the model.
3. **Simplifying model construction**: Compared with a traditional MSM, the GME method can accurately describe protein dynamics with a smaller number of states, which simplifies the construction and interpretation of the model.
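To make the role of the lag time concrete, the following is a minimal sketch of the Markovian baseline that the GME-based models aim to improve on. It is not the authors' tutorial code: the function names (`count_matrix`, `transition_matrix`, `propagate`) and the two-state toy trajectory are assumptions made for illustration. It estimates a transition probability matrix from a discrete state trajectory at a chosen lag time and then propagates it by matrix powers, which is exactly the memoryless assumption an MSM relies on.

```python
import numpy as np


def count_matrix(dtraj, n_states, lag):
    """Count transitions i -> j separated by `lag` frames (sliding window)."""
    if lag < 1:
        raise ValueError("lag must be a positive number of frames")
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize a count matrix into a transition probability matrix."""
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition")
    return counts / row_sums


def propagate(t_lag, n_steps):
    """Markovian prediction: T(n * lag) = T(lag) ** n."""
    return np.linalg.matrix_power(t_lag, n_steps)


# Toy data (an assumption): a trajectory sampled from a truly Markovian
# two-state chain, so the Markovian prediction should hold at any lag time.
rng = np.random.default_rng(0)
true_t = np.array([[0.95, 0.05],
                   [0.10, 0.90]])
states = [0]
for _ in range(20000):
    states.append(rng.choice(2, p=true_t[states[-1]]))
dtraj = np.array(states)

t1 = transition_matrix(count_matrix(dtraj, n_states=2, lag=1))
t5_direct = transition_matrix(count_matrix(dtraj, n_states=2, lag=5))
t5_predicted = propagate(t1, 5)

print("T(5) estimated directly:\n", t5_direct)
print("T(5) predicted from T(1):\n", t5_predicted)
```

For this toy chain the two matrices agree up to sampling noise. For states built from real MD data, the same comparison typically fails at short lag times because the projected dynamics retain memory; that gap is what the time-dependent memory kernel in the GME-based models is meant to account for.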
### Application examples:

The paper provides two specific examples showing how to apply the qMSM and IGME models to the dynamics of peptide systems: alanine dipeptide and villin headpiece. These examples demonstrate the effectiveness of the methods and also provide detailed steps and Python code for researchers to learn from and apply.

In conclusion, this paper aims to overcome the time-scale limitation of traditional MSMs in protein dynamics modeling by introducing non-Markovian dynamic models within the GME framework, providing a more efficient and accurate research tool.