Protein Function Prediction Based on Data Fusion and Functional Interrelationship

Jun Meng,Jael-Sanyanda Wekesa,Guan-Li Shi,Yu-Shi Luan
DOI: https://doi.org/10.1016/j.mbs.2016.02.001
IF: 3.935
2016-01-01
Mathematical Biosciences
Abstract:One of the challenging tasks of bioinformatics is to predict more accurate and confident protein functions from genomics and proteomics datasets. Computational approaches use a variety of high throughput experimental data, such as protein-protein interaction (PPI), protein sequences and phylogenetic profiles, to predict protein functions. This paper presents a method that uses transductive multi-label learning algorithm by integrating multiple data sources for classification. Multiple proteomics datasets are integrated to make inferences about functions of unknown proteins and use a directed bi-relational graph to assign labels to unannotated proteins. Our method, bi-relational graph based transductive multi-label function annotation (Bi-TMF) uses functional correlation and topological PPI network properties on both the training and testing datasets to predict protein functions through data fusion of the individual kernel result. The main purpose of our proposed method is to enhance the performance of classifier integration for protein function prediction algorithms. Experimental results demonstrate the effectiveness and efficiency of Bi-TMF on multi-sources datasets in yeast, human and mouse benchmarks. Bi-TMF outperforms other recently proposed methods.
What problem does this paper attempt to address?