Prediction of Protein Methylation Sites Using Conditional Random Field.

Yan Xu, Jun Ding, Qiang Huang, Nai-Yang Deng
DOI: https://doi.org/10.2174/092986613804096865
2013-01-01
Protein and Peptide Letters
Abstract: Protein methylation is an important, reversible post-translational modification that regulates diverse protein properties. Many methylation sites on arginine and lysine have been identified through experiments. However, experimental identification without prior knowledge is laborious and costly. Hence, there is interest in developing computational methods for reliable prediction of methylation sites. Such predictions may give researchers useful information for more productive discovery of candidate methylation sites. This work proposes Methcrf, a computational predictor based on conditional random fields (CRF) for predicting protein methylation sites, limited to lysine and arginine residues because too few experimentally verified data exist for other residues. The approach combines protein sequence features with structural information, such as the solvent accessibility of the amino acids that surround the methylation sites. In 10-fold cross-validation, Methcrf achieves an area under the receiver operating characteristic curve (AUC) of 0.85 and 0.80 for arginine and lysine, respectively. The proposed method performs comparably to previous methods for predicting methylation sites.