Two-Stage Distance Feature-based Optimization Algorithm for De Novo Protein Structure Prediction.

Gui-Jun Zhang, Xiao-Qi Wang, Lai-Fa Ma, Liu-Jing Wang, Jun Hu, Xiao-Gen Zhou
DOI: https://doi.org/10.1109/tcbb.2019.2917452
2020-01-01
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Abstract: De novo protein structure prediction can be treated as a conformational space optimization problem under the guidance of an energy function. However, designing an accurate energy function that ensures low-energy conformations are close to native structures remains a challenge. Fortunately, recent studies have shown that the accuracy of de novo protein structure prediction can be significantly improved by integrating residue-residue distance information. In this paper, a two-stage distance feature-based optimization algorithm (TDFO) for de novo protein structure prediction is proposed within the framework of an evolutionary algorithm. In TDFO, a similarity model is first designed using feature information extracted from distance profiles by the bisecting K-means algorithm. A similarity model-based selection strategy is then developed to guide the conformation search and thus improve the quality of the predicted models. Moreover, global and local mutation strategies are designed, and a state estimation strategy is proposed to strike a trade-off between exploration and exploitation of the search space. Experimental results on 35 benchmark proteins show that the proposed TDFO can improve prediction accuracy for a large portion of the test proteins.
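To make the feature-extraction step more concrete, here is a minimal sketch of bisecting K-means applied to the distance samples of a single residue pair. The abstract does not specify the layout of a distance profile, the number of clusters, or the form of the similarity model, so everything below is an assumption for illustration: the sample distances are made up, the names `split_in_two`, `bisecting_kmeans`, `distance_features`, and `feature_score` are hypothetical, and the scoring rule is a stand-in rather than the similarity model used in TDFO.

```python
from statistics import mean


def split_in_two(values, iters=20):
    """Plain 2-means on 1-D data, seeded at the minimum and maximum."""
    c0, c1 = min(values), max(values)
    left, right = list(values), []
    for _ in range(iters):
        left = [v for v in values if abs(v - c0) <= abs(v - c1)]
        right = [v for v in values if abs(v - c0) > abs(v - c1)]
        if not left or not right:
            break  # all values identical: the cluster cannot be split
        c0, c1 = mean(left), mean(right)
    return left, right


def sse(cluster):
    """Sum of squared deviations from the cluster mean."""
    m = mean(cluster)
    return sum((v - m) ** 2 for v in cluster)


def bisecting_kmeans(values, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(values)]
    while len(clusters) < k:
        clusters.sort(key=sse)
        target = clusters.pop()  # cluster with the largest SSE
        left, right = split_in_two(target)
        if not left or not right:
            clusters.append(target)
            break  # nothing left that can be split
        clusters += [left, right]
    return sorted(clusters, key=mean)


def distance_features(distances, k):
    """Summarize a distance profile as (cluster center, fraction of samples) pairs."""
    n = len(distances)
    return [(mean(c), len(c) / n) for c in bisecting_kmeans(distances, k)]


def feature_score(distance, features, tol=1.0):
    """Hypothetical score: weight of the nearest center if within tol, else 0."""
    center, weight = min(features, key=lambda f: abs(f[0] - distance))
    return weight if abs(center - distance) <= tol else 0.0


# Made-up distances (in angstroms) for one residue pair.
profile = [4.0, 4.2, 4.1, 8.0, 8.3, 12.0, 12.1]
features = distance_features(profile, 3)
```

In a selection step of this kind, a candidate conformation's residue-pair distances could be scored against such features and the scores summed, so that conformations consistent with the dominant distance clusters are preferred. That usage is likewise an assumption, not a description of the published method.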