The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

Abstract

Functional relationship networks, which reveal the collaborative roles between genes, have significantly accelerated our understanding of gene functions and phenotypic relevance. However, establishing such networks for alternatively spliced isoforms remains a difficult, unaddressed problem because systematic functional annotations are lacking at the isoform level, which makes most supervised learning methods difficult to apply to isoforms. Here we describe a novel multiple instance learning-based probabilistic approach that integrates large-scale, heterogeneous genomic datasets, including RNA-seq, exon array, protein docking and pseudo-amino acid composition, for modeling a global functional relationship network at the isoform level in the mouse. Using this approach, we formulate a gene pair as a set of isoform pairs with potentially different properties. Through simulation and cross-validation studies, we show the superior accuracy of our algorithm in revealing isoform-level functional relationships. The local networks reveal functional diversity among the isoforms of the same gene, as demonstrated by large-scale analyses and by experimental and literature evidence supporting the disparate functions that our network reveals for the isoforms of Ptbp1 and Anxa6. Our work can assist the understanding of the diversity of functions achieved by alternative splicing of a limited set of genes in mammalian genomes, and may shift the current gene-centered network prediction paradigm to the isoform level.

Author summary

Proteins carry out their functions by interacting with each other. Such interactions can be achieved through direct physical interactions, genetic interactions, or co-regulation. To summarize these interactions, researchers have established functional relationship networks, in which each gene is represented as a node and the connections between nodes represent how likely two genes are to work in the same biological process. Currently, these networks are established at the gene level only, although each gene in mammalian systems can be alternatively spliced into multiple isoforms that may have drastically different interaction partners. This information can be mined by integrating data that provide isoform-level information, such as RNA-seq and protein docking scores predicted from amino acid sequences. In this study, we developed a novel algorithm to integrate such data for predicting isoform-level functional relationship networks, which allows us to investigate the collaborative roles between genes at a high resolution.
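
The formulation of a gene pair as a set of isoform pairs can be illustrated with a minimal sketch. It assumes the standard multiple instance learning convention that a bag (here, a gene pair) is scored by its highest-scoring instance (here, an isoform pair); the abstract does not state how the authors combine isoform-pair scores, so the max rule, the function names, the isoform identifiers and the toy scores below are all hypothetical.

```python
from itertools import product


def isoform_pair_bag(isoforms_a, isoforms_b):
    """Build the bag for one gene pair: every cross-gene isoform pair."""
    return list(product(isoforms_a, isoforms_b))


def gene_pair_score(bag, instance_score):
    """Score the bag by its best instance (assumed max-aggregation rule)."""
    return max(instance_score(a, b) for a, b in bag)


def witness_pair(bag, instance_score):
    """Return the isoform pair that accounts for the bag-level score."""
    return max(bag, key=lambda pair: instance_score(*pair))


# Hypothetical isoform-pair scores for two placeholder genes, A and B.
toy_scores = {
    ("A.1", "B.1"): 0.1,
    ("A.1", "B.2"): 0.3,
    ("A.2", "B.1"): 0.8,
    ("A.2", "B.2"): 0.2,
}


def toy_score(a, b):
    """Look up the illustrative score of one isoform pair."""
    return toy_scores[(a, b)]


bag = isoform_pair_bag(["A.1", "A.2"], ["B.1", "B.2"])
best_score = gene_pair_score(bag, toy_score)
best_pair = witness_pair(bag, toy_score)
```

Under this assumption, a gene pair labeled as functionally related only requires at least one of its isoform pairs to be related, which is what lets gene-level annotations supervise isoform-level predictions.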

A Proteogenomic Approach to Understand Splice Isoform Functions Through Sequence and Expression-Based Computational Modeling

The Emerging Era of Genomic Data Integration for Analyzing Splice Isoform Function

Systematically Differentiating Functions for Alternatively Spliced Isoforms Through Integrating RNA-seq Data

Modeling the functional relationship network at the isoform level through heterogeneous data integration

Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks

Modeling the functional relationship network at the splice isoform level through heterogeneous data integration

Revisiting the Identification of Canonical Splice Isoforms Through Integration of Functional Genomics and Proteomics Evidence

From computational models of the splicing code to regulatory mechanisms and therapeutic implications

In silico and in cellulo approaches for functional annotation of human protein splice variants

IsoResolve: predicting splice isoform functions by integrating gene and isoform-level features with domain adaptation

GraphIsoFun - a Graph Neural Network Based Approach for Splice Isoform Function Prediction

Genome-Wide Functional Annotation of Human Protein-Coding Splice Variants Using Multiple Instance Learning

Tissue Specificity Based Isoform Function Prediction

A Network of Splice Isoforms for the Mouse

Alternative splicing: Human disease and quantitative analysis from high-throughput sequencing

Differentiating Isoform Functions with Collaborative Matrix Factorization

A systematic analysis of the effects of splicing on the diversity of post-translational modifications in protein isoforms

High-resolution functional annotation of human transcriptome: predicting isoform functions by a novel multiple instance-based label propagation method

Survey of Programs Used to Detect Alternative Splicing Isoforms from Deep Sequencing Data In Silico

A mixed model approach for joint genetic analysis of alternatively spliced transcript isoforms using RNA-Seq data

Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing