High-resolution functional annotation of human transcriptome: predicting isoform functions by a novel multiple instance-based label propagation method

Wenyuan Li, Shuli Kang, Chun-Chi Liu, Shihua Zhang, Yi Shi, Yan Liu, Xianghong Jasmine Zhou
DOI: https://doi.org/10.1093/nar/gkt1362
IF: 14.9
2013-12-25
Nucleic Acids Research
Abstract: Alternative transcript processing is an important mechanism for generating functional diversity in genes. However, little is known about the precise functions of individual isoforms. In fact, proteins (translated from transcript isoforms), not genes, are the function carriers. By integrating multiple human RNA-seq data sets, we carried out the first systematic prediction of isoform functions, enabling high-resolution functional annotation of the human transcriptome. Unlike gene function prediction, isoform function prediction faces a unique challenge: there is no isoform-level training data, because all known functional annotations are at the gene level. To address this challenge, we modelled the gene-isoform relationships as multiple instance data and developed a novel label propagation method to predict functions. Our method achieved an average area under the receiver operating characteristic curve of 0.67 and assigned functions to 15,572 isoforms. Interestingly, we observed that different functions have different sensitivities to alternative isoform processing, and that the function diversity of isoforms from the same gene is positively correlated with their tissue expression diversity. Finally, we surveyed the literature to validate our predictions for a number of apoptotic genes. Strikingly, for the well-known 'TP53' gene, we not only accurately identified the apoptosis regulation function of its five isoforms, but also correctly predicted the precise direction of the regulation.
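The multiple-instance framing in the abstract can be illustrated with a toy sketch: each gene is a "bag" of isoforms, only the gene carries a label, and scores spread over an isoform similarity network. The code below is not the authors' method. It is a generic label propagation update (f <- alpha * S f + (1 - alpha) * y on a row-normalized graph), and every name, weight, and parameter in it (`propagate`, `isoform_scores`, `gene_score`, `alpha`, the toy network) is a hypothetical choice made for illustration.

```python
def propagate(weights, seeds, alpha=0.5, iters=50):
    """Generic label propagation: f <- alpha * S f + (1 - alpha) * y.

    weights: square similarity matrix (list of lists), e.g. co-expression.
    seeds:   initial score per isoform (the vector y).
    """
    n = len(seeds)
    # Row-normalize so each row sums to 1; isolated nodes keep all zeros.
    norm = []
    for row in weights:
        total = sum(row)
        norm.append([w / total if total else 0.0 for w in row])
    scores = list(seeds)
    for _ in range(iters):
        scores = [
            alpha * sum(norm[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * seeds[i]
            for i in range(n)
        ]
    return scores


def isoform_scores(gene_to_isoforms, gene_labels, weights, alpha=0.5):
    """Seed every isoform of a positively labelled gene, then propagate.

    Assumption: a gene-level label is copied to all isoforms in its bag
    as a starting point, since no isoform-level labels exist.
    """
    seeds = [0.0] * len(weights)
    for gene, isoforms in gene_to_isoforms.items():
        if gene_labels.get(gene) == 1:
            for i in isoforms:
                seeds[i] = 1.0
    return propagate(weights, seeds, alpha)


def gene_score(isoforms, scores):
    """Multiple-instance view: a bag is as positive as its best instance."""
    return max(scores[i] for i in isoforms)


# Toy data (hypothetical): gene A has isoforms 0 and 1, gene B has
# isoform 2, gene C has isoforms 3 and 4. A and B are annotated.
gene_to_isoforms = {"A": [0, 1], "B": [2], "C": [3, 4]}
gene_labels = {"A": 1, "B": 1}

# Isoform 0 resembles isoform 2; isoform 1 resembles the unannotated
# isoform 3, which in turn resembles isoform 4.
weights = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
]

scores = isoform_scores(gene_to_isoforms, gene_labels, weights)
for idx, s in enumerate(scores):
    print(f"isoform {idx}: {s:.3f}")
```

In this toy network the two isoforms of gene A end up with different scores even though they started from the same gene-level label: isoform 0 is reinforced by its annotated neighbour, while isoform 1 loses score to its unannotated neighbours. That is the kind of within-gene differentiation that isoform-level prediction aims for.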
biochemistry & molecular biology