Prediction of the Secondary Structure Contents of Globular Proteins Based on Three Structural Classes
Chun-Ting Zhang, Ziding Zhang, Zhimin He
DOI: https://doi.org/10.1023/A:1022588803017
1998-01-01
Journal of Protein Chemistry
Abstract: Predicting the secondary structural contents (those of α-helix and β-strand) of a globular protein is of great use in the prediction of protein structure. In this paper, a new prediction algorithm is proposed based on Chou's database [Chou (1995), Proteins 21, 319–344]. The new algorithm is an improved multiple linear regression method that takes into account the nonlinear and coupling terms of the frequencies of different amino acids and the length of the protein. The prediction is also based on the structural classes of proteins, but instead of four classes, only three are considered: the α class, the β class, and the combined α+β and α/β class, referred to simply as the αβ class. Thus the ambiguity that usually occurs between α+β proteins and α/β proteins is eliminated. A resubstitution examination of the algorithm shows that the average absolute errors are 0.040 and 0.035 for the prediction of α-helix content and β-strand content, respectively. A cross-validation examination, the jackknife analysis, shows that the average absolute errors are 0.051 and 0.045 for the prediction of α-helix content and β-strand content, respectively. Both examinations indicate the self-consistency and the extrapolating effectiveness of the new algorithm. Compared with other methods, ours has the merits of simplicity and convenience of use, as well as high prediction accuracy. Because the prediction of the structural class is incorporated, the only inputs our method requires are the amino acid composition and the length of the protein to be predicted.
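To make the two examinations concrete, the following is a minimal sketch of how amino acid composition, the resubstitution error, and the jackknife (leave-one-out) error could be computed. It is not the algorithm of the paper: the regression terms and coefficients are not given in the abstract, so a one-variable least-squares line stands in for the improved multiple linear regression, and all function names (`composition`, `fit_line`, `resubstitution_error`, `jackknife_error`) are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the frequency of each standard amino acid in the sequence."""
    counts = Counter(sequence.upper())
    n = len(sequence)
    return [counts[a] / n for a in AMINO_ACIDS]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Assumption: a stand-in for the paper's regression, which uses many
    more terms (nonlinear and coupling terms, protein length).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def average_absolute_error(predicted, observed):
    """Mean of |predicted - observed| over all proteins."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def resubstitution_error(xs, ys):
    """Fit on all proteins, then predict the same proteins."""
    a, b = fit_line(xs, ys)
    return average_absolute_error([a + b * x for x in xs], ys)


def jackknife_error(xs, ys):
    """Leave each protein out in turn, fit on the rest, predict the one left out."""
    predictions = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(a + b * xs[i])
    return average_absolute_error(predictions, ys)


if __name__ == "__main__":
    # Made-up toy values for illustration only, not data from Chou's database.
    feature = [0.10, 0.20, 0.30, 0.40, 0.50]
    helix_content = [0.15, 0.22, 0.41, 0.38, 0.55]
    print("resubstitution:", resubstitution_error(feature, helix_content))
    print("jackknife:", jackknife_error(feature, helix_content))
```

For a least-squares fit the jackknife error is never smaller than the resubstitution error, because each left-out protein is predicted by a model that has not seen it. This matches the direction of the figures reported in the abstract (0.051 versus 0.040 for α-helix content, and 0.045 versus 0.035 for β-strand content).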