Protein Secondary Structure Prediction Using Support Vector Machine with a PSSM Profile and an Advanced Tertiary Classifier

Hae-Jin Hu, Phang C. Tai, Robert W. Harrison, Jieyue He, Yi Pan
DOI: https://doi.org/10.1109/csbw.2005.114
2005-01-01
Abstract: In this study, the support vector machine (SVM) is applied as a learning machine for protein secondary structure prediction. The position-specific scoring matrix (PSSM) is adopted as the encoding scheme for training the SVM. To improve the prediction accuracy, three aspects are optimized: the encoding scheme, the sliding window size, and the SVM parameters. For the multi-class classification, the results of three one-versus-one binary classifiers (H/E, E/C and C/H) are combined using our new tertiary classifier, called SVM_Represent. With this new tertiary classifier, the Q3 prediction accuracy reaches 89.6% on the RS126 dataset and 90.1% on the CB513 dataset. The Segment Overlap Measure (SOV) is 85.0% on the RS126 dataset and 85.7% on the CB513 dataset. Compared with the best existing prediction methods, our new prediction algorithm improves the accuracy by about 13% in terms of Q3 and SOV, the two most commonly used accuracy measures.
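To make the sliding-window encoding and the Q3 measure concrete, here is a minimal Python sketch. It is illustrative only, not the authors' implementation: the function names `window_features` and `q3`, the zero-padding at the chain termini, and the toy two-column matrix are all assumptions (a real PSSM has 20 columns per residue). The abstract does not describe how SVM_Represent combines the binary classifiers, so that step is not sketched here.

```python
from typing import List, Sequence


def window_features(pssm: Sequence[Sequence[float]], window: int) -> List[List[float]]:
    """Build one feature vector per residue by concatenating the PSSM rows
    inside a window centered on that residue.

    Assumption: positions beyond the chain termini are zero-padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n_cols = len(pssm[0])
    pad = [0.0] * n_cols
    features = []
    for i in range(len(pssm)):
        row: List[float] = []
        for j in range(i - half, i + half + 1):
            # Use the real PSSM row when j is inside the chain, else padding.
            row.extend(pssm[j] if 0 <= j < len(pssm) else pad)
        features.append(row)
    return features


def q3(predicted: str, observed: str) -> float:
    """Q3: percentage of residues whose three-state label (H, E or C)
    is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Toy PSSM with 3 residues and 2 columns (hypothetical values).
toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_features = window_features(toy_pssm, window=3)
toy_q3 = q3("HHEC", "HHCC")
```

Each feature vector has `window * n_cols` entries, so the window size directly sets the input dimension of the SVM. This is one reason the window size is a natural quantity to optimize.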