Predicting protein secondary structure and solvent accessibility with an improved multiple linear regression method.

Sanbo Qin, Yun He, Xian-Ming Pan
DOI: https://doi.org/10.1002/prot.20645
2005-01-01
Abstract: We have improved the multiple linear regression (MLR) algorithm for protein secondary structure prediction by combining it with the evolutionary information provided by PSI-BLAST multiple sequence alignments. On the CB513 dataset, the average three-state per-residue accuracy, Q3, reached 76.4%, and the segment overlap accuracy, SOV99, reached 73.2%, using a rigorous jackknife procedure and the strictest reduction of the eight-state DSSP definition to three states. This represents an improvement of approximately 5% in overall per-residue accuracy compared with previous work. Relative solvent accessibility prediction also benefited from this combination of methods. The system achieved 77.7% average jackknifed accuracy for two-state prediction based on a 25% relative solvent accessibility mode, with a Matthews correlation coefficient of 0.548. The improved MLR secondary structure and relative solvent accessibility prediction server is available at http://spg.biosci.tsinghua.edu.cn/. Proteins 2005;61:473-480. © 2005 Wiley-Liss, Inc.