CGLFold: a contact-assisted de novo protein structure prediction using global exploration and loop perturbation sampling algorithm

Jun Liu, Xiao-Gen Zhou, Yang Zhang, Gui-Jun Zhang
DOI: https://doi.org/10.1093/bioinformatics/btz943
IF: 5.8
2020-01-01
Bioinformatics
Abstract: Motivation: Loops are the regions that connect secondary structure elements in a protein, and a slight change in a loop can dramatically alter the entire topology. This study investigates whether the accuracy of protein structure prediction can be improved using a loop-specific sampling strategy. Results: This study proposes a novel de novo protein structure prediction method that combines global exploration and loop perturbation. In the global exploration phase, fragment recombination and assembly are used to explore the vast conformational space and generate native-like topologies. In the loop perturbation phase, a loop-specific local perturbation model is designed to improve the accuracy of the conformation and is solved using a differential evolution algorithm. Together, the two phases let global exploration and local exploitation cooperate. Filtered contact information is used to construct the conformation selection model that guides the sampling. CGLFold is tested on 145 benchmark proteins, 14 free modeling (FM) targets from CASP13 and 29 FM targets from CASP12. The experimental results show that loop-specific local perturbation increases structural diversity and the success rate of conformational updates, and gradually improves conformation accuracy. CGLFold obtains models with a template modeling score (TM-score) >= 0.5 for 95 standard test proteins, 7 FM targets of CASP13 and 9 FM targets of CASP12.
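To illustrate the kind of optimization the loop perturbation phase refers to, the following is a minimal sketch of differential evolution applied to a vector of loop torsion angles. It is not the CGLFold implementation: the DE/rand/1/bin variant, the torsion-angle encoding, the parameter values and the function names are all assumptions made for illustration, and `toy_energy` is a hypothetical stand-in for the contact-based selection model described in the abstract.

```python
import random


def wrap_angle(a):
    """Wrap an angle in degrees into the range [-180, 180)."""
    return (a + 180.0) % 360.0 - 180.0


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs(wrap_angle(a - b))


def toy_energy(angles, target):
    """Hypothetical score (lower is better); NOT the CGLFold energy."""
    return sum(angular_distance(a, t) ** 2 for a, t in zip(angles, target))


def differential_evolution(energy, dim, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """DE/rand/1/bin over `dim` torsion angles.

    Returns (best_vector, best_energy). All defaults are illustrative.
    """
    rng = random.Random(seed)
    # Random initial population of loop torsion-angle vectors.
    pop = [[rng.uniform(-180.0, 180.0) for _ in range(dim)]
           for _ in range(pop_size)]
    scores = [energy(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, none of them the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated component
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    # Mutation: base vector plus scaled angular difference.
                    v = pop[a][j] + f * wrap_angle(pop[b][j] - pop[c][j])
                    trial.append(wrap_angle(v))
                else:
                    trial.append(pop[i][j])
            s = energy(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if s <= scores[i]:
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


if __name__ == "__main__":
    # Hypothetical target torsions for a four-angle loop segment.
    target = [-60.0, -45.0, 120.0, 150.0]
    best, score = differential_evolution(lambda x: toy_energy(x, target), 4)
    print(best, score)
```

Because selection is greedy, the best score in the population never gets worse from one generation to the next, which is the property that makes DE a natural fit for local refinement after a global search.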