Prediction of Protein Structural Class Using PSI-BLAST Profile Based Collocation of Amino Acid Pairs

Ke Chen, L. Kurgan, Jishou Ruan
DOI: https://doi.org/10.1109/icbbe.2007.8
2007-01-01
Abstract: Knowledge of structural classes is useful in understanding folding patterns in proteins. Numerous structural class prediction methods have been proposed in the past. Although virtually all state-of-the-art classifiers have already been tried, many of these methods use a very simple protein sequence representation that often includes amino acid (AA) composition. To this end, we propose a novel sequence representation based on PSI-BLAST profile-based collocation of AA pairs. We used two benchmark datasets constructed by Zhou (J. of Prot Chem. 1998, 17(8):729-38) to test the proposed representation with five representative classifiers. The two best classifiers, which include a support vector machine and an instance-based learner, achieved 88% and 96% accuracy on the two datasets, respectively. Our results were compared with five recently proposed methods. The comparison shows the superiority of the proposed method, which reduces the error rates by 30% and 21% on the two datasets when compared with the best-performing ensemble of boosted logistic regression classifiers. Finally, the new sequence representation is compared with AA composition using a support vector machine classifier. The error rate reduction due to application of the new representation equals 40% and 25% for the two datasets, respectively. In short, the PSI-BLAST profile-based collocation of AA pairs is shown to be a promising feature-based sequence representation.
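The error rate reductions quoted in the abstract are relative figures: the drop in error rate divided by the baseline error rate. The abstract does not spell out the formula, so the sketch below assumes that standard definition. The function name `relative_error_reduction` and the 80% baseline accuracy are hypothetical, chosen only for illustration; the abstract reports the 88% accuracy but not the baseline accuracies.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Return the fraction by which the error rate drops relative to the baseline.

    Accuracies are fractions in [0, 1]; error rate is taken as 1 - accuracy.
    """
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - new_error) / baseline_error


# Hypothetical baseline accuracy (not reported in the abstract) compared
# against the 88% accuracy that the abstract reports for the first dataset.
reduction = relative_error_reduction(baseline_accuracy=0.80, new_accuracy=0.88)
print(f"relative error reduction: {reduction:.0%}")
```

Under this definition, a gain of a few accuracy points can correspond to a large relative reduction when the baseline error is already small, which is why the abstract reports both accuracies and error rate reductions.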