A Fast Approach to Protein Structure Alignment
Hong Fang, Junping Xiang, Maolin Hu
DOI: https://doi.org/10.1166/jctn.2007.2426
2007-01-01
Journal of Computational and Theoretical Nanoscience
Abstract: The problem of establishing a correspondence between the residues of two protein structures is fundamental in computational structural biology. With a rapidly growing pool of known tertiary structures, the importance of protein structure comparison parallels, and may surpass, that of sequence alignment, since protein structure is more conserved than sequence. However, current structural alignment is typically based on the Euclidean distance between corresponding residues (C-alpha atoms), and because the corresponding points are not known in advance, the algorithms have to search over all residues along the main chains of the proteins. As a result, the underlying problem is NP-hard and the existing algorithms are heuristics. In this paper, a novel protein structure alignment algorithm for optimal pairwise alignment is provided. The method compares invariants of the three-dimensional structure instead of atom distances, and the corresponding atoms are established through these invariants, which are, for example, the volumes or angles defined by four atoms along the residues of the backbone. As is well known, the volume of an object is independent of the coordinate system, so it is the same under any rotation or translation. If two structures have the same conformation, they must have the same volumes; conversely, if the invariants differ, the structures must differ. By comparing the volume series along the main chain, which can be done through dynamic programming, local similarities can be found, and the protein structures are aligned according to these local similarities. The algorithm is intrinsically local-versus-local and can be extended to all-against-all alignment. Our method transforms the alignment of three-dimensional data into the comparison of two one-dimensional series, so the structure alignment can be solved in polynomial time by dynamic programming methods.
It also searches for local similarities instead of comparing whole proteins, which is better suited to finding conserved residues and protein folds than current alignment programs. The algorithm can also take side chains into consideration. The computation time is greatly reduced compared with current structure alignment programs, so the method can be run online. We have compared our algorithm with currently popular protein structure alignment methods on two different kinds of data; the results show the superior performance of our algorithm.
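
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the invariant is the signed volume of the tetrahedron spanned by four consecutive C-alpha atoms, and it substitutes a generic Smith-Waterman style local alignment for the paper's dynamic programming step. The function names, the tolerance `tol`, the scoring values, and the toy coordinates are all hypothetical.

```python
import math

def signed_volume(a, b, c, d):
    """Signed volume of the tetrahedron spanned by four 3D points."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Scalar triple product u . (v x w), divided by 6.
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return det / 6.0

def volume_series(coords):
    """One volume per window of four consecutive points (assumed invariant)."""
    return [signed_volume(*coords[i:i + 4]) for i in range(len(coords) - 3)]

def local_align(s, t, tol=0.5, match=1.0, mismatch=-1.0, gap=-1.0):
    """Smith-Waterman style local alignment of two 1D series.

    Two values count as a match when they differ by at most tol.
    Returns the best local score and the (i, j) cell where it ends.
    Runs in O(len(s) * len(t)) time.
    """
    n, m = len(s), len(t)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_end = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if abs(s[i - 1] - t[j - 1]) <= tol else mismatch
            H[i][j] = max(0.0,
                          H[i - 1][j - 1] + sub,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_end = H[i][j], (i, j)
    return best, best_end

def rigid_transform(coords, angle, shift):
    """Rotate about the z axis by angle (radians), then translate by shift."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Toy backbone: ten points on a widening helix (made-up data, not a real protein).
coords = [((2.3 + 0.1 * i) * math.cos(1.745 * i),
           (2.3 + 0.1 * i) * math.sin(1.745 * i),
           1.5 * i) for i in range(10)]
moved = rigid_transform(coords, 0.7, (5.0, -3.0, 2.0))

series_a = volume_series(coords)
series_b = volume_series(moved)
score, end = local_align(series_a, series_b, tol=1e-6)
```

Here `series_a` and `series_b` agree up to floating-point error even though the second structure was rotated and translated, so no superposition step is needed before the one-dimensional comparison. Note that a signed volume changes sign under reflection, so this particular invariant distinguishes mirror images.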