A novel algorithm for identifying protein kinases associated with phosphorylation sites based on Bayesian decision theory

Liang ZOU, Ao LI, Yan HAN, Huanqing FENG, Minghui WANG
DOI: https://doi.org/10.3969/j.issn.1002-3208.2014.03.08
2014-01-01
Abstract: Objective: A novel machine learning method is proposed to identify the protein kinases responsible for known phosphorylation sites, addressing the lack of kinase information for such sites. Methods: Following the hierarchical structure of human kinases, we first constructed a dataset for each kinase or kinase cluster from the kinase-specific phosphorylation instances extracted from the latest version of Phospho.ELM (9.0). Based on Bayesian decision theory, we analyzed the amino acid distribution at each residue position around the phosphorylation sites in the positive and negative datasets separately and constructed the corresponding statistical models. We then evaluated the algorithm with a leave-one-out strategy on the various datasets. Results: At a false positive rate of 1%, the sensitivities for MAPK, PKA and RSK reached 23%, 24% and 33%, respectively. The prediction performance was also significantly better than that of phosphorylation site prediction methods such as KinasePhos and NetPhosK. Conclusions: The proposed algorithm based on Bayesian decision theory effectively improved identification performance and contributes to a better understanding of the biological mechanism of protein phosphorylation.
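The abstract does not give the exact form of the statistical models, so the following is only a minimal sketch of one common way to apply Bayesian decision theory to this kind of data: a position-specific log-likelihood ratio between a positive and a negative set of sequence windows, with a leave-one-out pass over the positives. The function names, the pseudocount, the window length, and the toy peptides are all assumptions made for illustration; they are not taken from the paper or from Phospho.ELM. Windows are assumed to have equal length and to contain only the 20 standard residues.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumption: no gaps or padding)

def position_frequencies(windows, pseudocount=1.0):
    """Estimate P(amino acid | position) for each window position, with additive smoothing."""
    length = len(windows[0])
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    tables = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        tables.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return tables

def log_odds_score(window, pos_tables, neg_tables):
    """Sum over positions of log P(residue | positive) / P(residue | negative).

    Positions are treated as independent (a simplifying assumption of this sketch).
    """
    return sum(math.log(p[aa] / n[aa])
               for aa, p, n in zip(window, pos_tables, neg_tables))

def leave_one_out_scores(positives, negatives, pseudocount=1.0):
    """Score each positive window with a model trained on the remaining positives."""
    neg_tables = position_frequencies(negatives, pseudocount)
    scores = []
    for i, held_out in enumerate(positives):
        training = positives[:i] + positives[i + 1:]
        pos_tables = position_frequencies(training, pseudocount)
        scores.append(log_odds_score(held_out, pos_tables, neg_tables))
    return scores

# Hypothetical toy windows centred on a serine; NOT real Phospho.ELM data.
positives = ["RRASV", "RRLSL", "KRASI", "RRGSF"]
negatives = ["GAPSD", "LEDSE", "TGESA", "PDASG"]

pos_tables = position_frequencies(positives)
neg_tables = position_frequencies(negatives)

score_like_positive = log_odds_score("RRASL", pos_tables, neg_tables)
score_like_negative = log_odds_score("GEDSE", pos_tables, neg_tables)
loo_scores = leave_one_out_scores(positives, negatives)
```

Under a 0-1 loss, the Bayes decision rule assigns a site to the kinase when this score plus the log prior ratio exceeds zero. In practice the threshold can instead be tuned on negative-set scores to hit a target false positive rate, such as the 1% operating point quoted in the Results.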