Clustering Multi-Domain Protein Structures in the Essential Dynamics Subspace

Bin Wen, Yunyu Shi, Zhiyong Zhang
DOI: https://doi.org/10.1142/s0219633613410083
2013-01-01
Abstract: A multi-domain protein can exist in solution as an equilibrium of different conformations, which may be critical to its biological function. Besides experimental techniques, computational methods such as molecular dynamics (MD) simulations are suitable for studying the inter-domain motions of the protein and sampling its different conformational states. An MD simulation usually generates a trajectory containing a large number of protein structures, so a post-processing cluster analysis is needed to group similar structures into clusters and identify the typical conformations of the multi-domain protein. In this paper, the widely used k-means clustering algorithm is implemented in the protein essential dynamics (ED) subspace defined by principal component analysis of the MD trajectory. Cluster analysis of the tandem WW domains of formin binding protein 21 (FBP21) demonstrates that k-means clustering based on distances between structures in the ED subspace gives better results than clustering based on other metrics, such as pairwise inter-domain residue distances.
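The approach summarized in the abstract (principal component analysis of the trajectory, followed by k-means on the projections) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the trajectory is already superimposed on a reference structure and flattened to an array of shape (n_frames, 3 * n_atoms). The toy two-state trajectory, the function names, the choice of two principal components, and the farthest-first initialization of the cluster centers are all assumptions made for illustration.

```python
import numpy as np


def make_toy_trajectory(n_per_state=100, n_coords=30, seed=0):
    """Hypothetical stand-in for an MD trajectory: two conformational
    states separated along one collective direction, plus small noise."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=n_coords)
    direction = rng.normal(size=n_coords)
    direction /= np.linalg.norm(direction)
    state_a = base + 0.1 * rng.normal(size=(n_per_state, n_coords))
    state_b = base + 5.0 * direction + 0.1 * rng.normal(size=(n_per_state, n_coords))
    return np.vstack([state_a, state_b])


def project_to_ed_subspace(coords, n_components):
    """Project frames onto the first principal components.

    coords: (n_frames, 3 * n_atoms), assumed already superimposed.
    The rows of vt are the eigenvectors of the coordinate covariance
    matrix, ordered by decreasing variance.
    """
    centered = coords - coords.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means with Euclidean distances.

    Centers are initialized deterministically (farthest-first from the
    first frame); this is an illustrative choice, not the paper's.
    """
    centers = [points[0]]
    for _ in range(1, k):
        diff = points[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.linalg.norm(diff, axis=2).min(axis=1)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


coords = make_toy_trajectory()
proj = project_to_ed_subspace(coords, n_components=2)
labels, centers = kmeans(proj, k=2)
print("cluster sizes:", np.bincount(labels, minlength=2))
```

For a real trajectory, the coordinates would come from an MD analysis package after rigid-body fitting, and both the number of principal components and the number of clusters would have to be chosen for the system under study.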