Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach

Taigang Liu, Yufang Qin, Yongjie Wang, Chunhua Wang
DOI: https://doi.org/10.3390/ijms17010015
IF: 5.6
2015-12-24
International Journal of Molecular Sciences
Abstract: Prior knowledge of a protein's structural class may offer useful clues for understanding its function as well as its tertiary structure. Although significant efforts have been made to find a fast and effective computational approach to this problem, it remains a challenging topic in bioinformatics. The position-specific score matrix (PSSM) profile has been shown to provide a useful source of information for improving the prediction of protein structural class. However, this information has not been adequately explored. To this end, we present a feature extraction technique based on gapped-dipeptide composition computed directly from the PSSM. Feature selection is then performed using support vector machine-recursive feature elimination (SVM-RFE), and the selected features are used to construct the final predictor. Jackknife tests on four working datasets show that our method obtains satisfactory prediction accuracies by extracting features solely from the PSSM and could serve as a promising tool for predicting protein structural class.
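To make the idea of a gapped-dipeptide composition computed from a PSSM more concrete, here is a minimal sketch in Python. The abstract does not give the formula, so everything specific here is an assumption made for illustration: the function name `gapped_dipeptide_features`, the `max_gap` parameter, the logistic rescaling of the raw scores, and the definition of each feature as the average product of the score for residue type i at position k and the score for residue type j at position k + g + 1, where g is the gap.

```python
import numpy as np


def gapped_dipeptide_features(pssm, max_gap=2):
    """Illustrative gapped-dipeptide features from an L x 20 PSSM.

    Assumed formulation (not taken from the paper): for each gap g in
    0..max_gap, feature (i, j) is the mean over positions k of
    scaled[k, i] * scaled[k + g + 1, j].
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[1] != 20:
        raise ValueError("expected an L x 20 PSSM")
    length = pssm.shape[0]

    # Assumed normalization: squash raw scores into (0, 1) with a logistic function.
    scaled = 1.0 / (1.0 + np.exp(-pssm))

    blocks = []
    for gap in range(max_gap + 1):
        offset = gap + 1  # distance between the two positions of the pair
        if length <= offset:
            raise ValueError("sequence is too short for the requested gap")
        left = scaled[:-offset]   # positions k
        right = scaled[offset:]   # positions k + gap + 1
        # (20 x n_pairs) @ (n_pairs x 20) sums the products over all position pairs.
        block = left.T @ right / (length - offset)
        blocks.append(block.ravel())

    # One 400-dimensional block per gap value, concatenated.
    return np.concatenate(blocks)


# Toy input: a random 30-residue "PSSM" with integer scores in [-5, 5].
rng = np.random.default_rng(0)
toy_pssm = rng.integers(-5, 6, size=(30, 20))
features = gapped_dipeptide_features(toy_pssm, max_gap=2)
print(features.shape)  # (1200,)
```

With gaps 0 to 2, each protein is mapped to a fixed-length vector of 3 × 400 = 1200 values regardless of sequence length. A feature matrix built this way could then be passed to a recursive elimination routine, for example scikit-learn's `RFE` wrapped around a linear-kernel SVM, as one possible stand-in for the SVM-RFE step; the paper's actual settings are not stated in the abstract.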
biochemistry & molecular biology,chemistry, multidisciplinary