Data-driven Computation of Molecular Reaction Coordinates

Andreas Bittracher, Ralf Banisch, Christof Schütte
DOI: https://doi.org/10.1063/1.5035183
2018-10-19
Abstract: The identification of meaningful reaction coordinates plays a key role in the study of complex molecular systems whose essential dynamics is characterized by rare or slow transition events. In a recent publication, precise defining characteristics of such reaction coordinates were identified and linked to the existence of a so-called transition manifold. This theory gives rise to a novel numerical method for the pointwise computation of reaction coordinates that relies on short parallel MD simulations only, but yields accurate approximation of the long time behavior of the system under consideration. This article presents an extension of the method towards practical applicability in computational chemistry. It links the newly defined reaction coordinates to concepts from transition path theory and Markov state model building. The main result is an alternative computational scheme that allows for a global computation of reaction coordinates based on commonly available types of simulation data, such as single long molecular trajectories, or the push-forward of arbitrary canonically-distributed point clouds. It is based on a Galerkin approximation of the transition manifold reaction coordinates, that can be tuned to individual requirements by the choice of the Galerkin ansatz functions. Moreover, we propose a ready-to-implement variant of the new scheme, that computes data-fitted, mesh-free ansatz functions directly from the available simulation data. The efficacy of the new method is demonstrated on a small protein system.
Computational Physics, Dynamical Systems, Chemical Physics
What problem does this paper attempt to address?
This paper addresses the problem of identifying meaningful reaction coordinates for complex molecular systems whose essential dynamics are governed by rare or slow transition events. Traditional kinetic models such as Markov state models (MSMs) simplify the description of system behavior, but they lose information about the transition process and its dynamics. The paper therefore proposes a data-driven numerical method for computing reaction coordinates that preserves the long-time behavior of the system and is suited to practical use in computational chemistry. The main contributions are:
1. **A new computational scheme**: Reaction coordinates are computed globally from commonly available types of simulation data, such as a single long molecular trajectory or the push-forward of an arbitrary canonically distributed point cloud.
2. **A Galerkin approximation of the transition manifold reaction coordinates**: The approximation can be tuned to individual requirements through the choice of the Galerkin ansatz functions.
3. **A ready-to-implement variant**: This variant computes data-fitted, mesh-free ansatz functions directly from the available simulation data, which makes the method more practical and flexible.
4. **A demonstration of the method**: The efficacy of the new method is demonstrated on a small protein system.
In summary, the paper aims to overcome limitations of existing approaches to computing reaction coordinates by developing a data-driven computational method that helps to understand and describe the dynamics of complex molecular systems.
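As a rough illustration of what data-fitted, mesh-free ansatz functions built from a single long trajectory can look like, the following sketch constructs a partition-of-unity basis of normalized Gaussians centered at quantiles of the data and projects the lagged dynamics onto it. It is not the scheme from the paper and does not compute transition manifold reaction coordinates: the one-dimensional double-well potential, the basis choice, the row-normalized (lumped-mass) simplification of the Galerkin matrix, and all function names and parameter values are assumptions made only for this example.

```python
import numpy as np


def simulate_double_well(n_steps=20000, dt=1e-3, beta=3.0, seed=0):
    """Euler-Maruyama trajectory of overdamped Langevin dynamics
    in the toy potential V(x) = (x^2 - 1)^2 (an assumed test system)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps)
    x[0] = -1.0
    noise = np.sqrt(2.0 * dt / beta) * rng.standard_normal(n_steps - 1)
    for t in range(n_steps - 1):
        grad = 4.0 * x[t] * (x[t] ** 2 - 1.0)  # V'(x)
        x[t + 1] = x[t] - grad * dt + noise[t]
    return x


def partition_of_unity_basis(points, centers, bandwidth):
    """Normalized Gaussian bumps: nonnegative and summing to one at every point."""
    logits = -((points[:, None] - centers[None, :]) ** 2) / (2.0 * bandwidth ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # avoid underflow far from centers
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def projected_transition_matrix(traj, centers, bandwidth, lag):
    """Row-normalized time-lagged correlation matrix of the basis functions.

    This replaces the full Galerkin mass matrix by its row sums (lumping),
    which keeps the result a row-stochastic matrix.
    """
    phi_now = partition_of_unity_basis(traj[:-lag], centers, bandwidth)
    phi_next = partition_of_unity_basis(traj[lag:], centers, bandwidth)
    corr = phi_now.T @ phi_next / phi_now.shape[0]
    mass = phi_now.mean(axis=0)
    return corr / mass[:, None]


traj = simulate_double_well()
# Data-fitted centers: quantiles of the sampled points, so no mesh is required.
centers = np.quantile(traj, np.linspace(0.0, 1.0, 10))
T = projected_transition_matrix(traj, centers, bandwidth=0.2, lag=50)
print(T.shape)
print(T.sum(axis=1))
```

Because the basis functions sum to one everywhere, each row of `T` sums to one, so `T` can be read as a transition matrix between "fuzzy" states, which is the kind of object that Markov state model building works with. Placing the centers at data quantiles is only one simple way to fit the basis to the data.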