Phosphorylation Site Prediction Integrating The Position Feature With Sequence Evolution Information

Tan Si-Qiao, Li Qian, Chen Yuan, Peng Jian
DOI: https://doi.org/10.16476/j.pibb.2016.0351
2017-01-01
PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS
Abstract: Phosphorylation is a major post-translational modification of proteins, and its prediction can be classified as kinase-specific or non-kinase-specific. This paper focuses on non-kinase-specific prediction. Using Dou's dataset of phosphorylation sites as the template, it develops a position-based chi-square table feature, χ²-pos, and integrates this feature with the pseudo position-specific scoring matrix (PsePSSM). A Support Vector Machine (SVM) classifier was built with balanced positive and negative samples. In independent tests on serine (S), threonine (T), and tyrosine (Y) sites, the Matthews correlation coefficient, the area under the ROC curve (AUC), and the precision were (0.59, 0.87, 79.74%), (0.55, 0.85, 77.68%), and (0.50, 0.81, 75.22%), respectively, which are significantly superior to previously reported results. Integrating χ²-pos with PsePSSM offers a promising method for predicting phosphorylation sites in proteins more accurately.
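For readers unfamiliar with the reported metrics, the following minimal Python sketch shows how the Matthews correlation coefficient and precision are conventionally computed from a binary confusion matrix. It is an illustration only, not the authors' evaluation code; the function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical balanced test set: 100 phosphorylated sites, 100 non-sites.
tp, fn = 80, 20  # positives predicted correctly / missed
tn, fp = 80, 20  # negatives predicted correctly / falsely flagged

example_mcc = matthews_corrcoef(tp, fp, tn, fn)
example_precision = precision(tp, fp)
print(f"MCC = {example_mcc:.2f}, precision = {example_precision:.2%}")
```

On a balanced test set like the one assumed here, the coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often reported alongside the AUC.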