Incorporating Key Position and Amino Acid Residue Features to Identify General and Species-Specific Ubiquitin Conjugation Sites

Xiang Chen, Jian-Ding Qiu, Shao-Ping Shi, Sheng-Bao Suo, Shu-Yun Huang, Ru-Ping Liang
DOI: https://doi.org/10.1093/bioinformatics/btt196
IF: 5.8
2013-01-01
Bioinformatics
Abstract: Motivation: Systematic dissection of the ubiquitylation proteome is emerging as an appealing but challenging research topic because of the significant roles ubiquitylation plays not only in protein degradation but also in many other cellular functions. High-throughput experimental studies using mass spectrometry have identified many ubiquitylation sites, primarily from eukaryotes. However, the vast majority of ubiquitylation sites remain undiscovered, even in well-studied systems. Because mass spectrometry-based experimental approaches for identifying ubiquitylation events are costly, time-consuming and biased toward abundant proteins and proteotypic peptides, in silico prediction of ubiquitylation sites is a potentially useful alternative strategy for whole-proteome annotation. Because of various limitations, current ubiquitylation site prediction tools are not well suited to comprehensive proteome-wide assessment.
Results: We present a novel tool, UbiProber, specifically designed for large-scale prediction of both general and species-specific ubiquitylation sites. We collected ubiquitylation proteomics data for multiple species from several reliable sources and used them to train prediction models with a comprehensive machine-learning approach that integrates information from key positions and key amino acid residues. Cross-validation tests reveal that UbiProber achieves some improvement over existing tools in predicting species-specific ubiquitylation sites. Moreover, independent tests show that UbiProber improves the areas under receiver operating characteristic curves by approximately 15% when using the Combined model.