Abstract: Protein complexes are a cornerstone of many biological processes, and together they form various types of molecular machinery that perform a vast array of biological functions. The growing amount of protein-protein interaction (PPI) data has enabled a number of computational methods for predicting protein complexes. Many of these algorithms detect complexes using the PPI data alone. However, PPI data from high-throughput techniques is flooded with false interactions, and the insufficiency of the PPI data significantly lowers the accuracy of these methods. In the current work, we develop a novel method named CMBI to discover protein complexes via the integration of multiple biological resources, including gene expression profiles, essential protein information and PPI data. First, CMBI defines the functional similarity (FS) of each pair of interacting proteins based on the edge-clustering coefficient (ECC) from the PPI network and the Pearson correlation coefficient (PCC) from the gene expression data. Second, CMBI selects essential proteins as seeds to build the protein complex cores. During the growth process, the seeds' essential protein neighbors and the neighbors whose FS with the seeds exceeds the threshold T are added to the complex cores. After the complex cores are constructed, CMBI generates protein complexes by attaching to the cores their direct neighbors with FS > T. In addition to the essential proteins, CMBI also uses other proteins as seeds to expand protein complexes. To check the performance of CMBI, we compare the complexes discovered by CMBI with those found by other techniques by matching the predicted complexes against the reference complexes. We subsequently use GO::TermFinder to analyze the complexes predicted by the various methods. Finally, the effect of parameter T is investigated.
The results from GO functional enrichment and matching analyses show that CMBI performs significantly better than the state-of-the-art methods. This indicates that integrating multiple sources of biological information is an effective way to identify protein complexes in the PPI network.
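
The core-then-attachment procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the abstract does not say how ECC and PCC are combined, so FS is taken here as their plain sum; ECC uses the common form "shared neighbors divided by min(deg(u) - 1, deg(v) - 1)"; and a neighbor is attached when its FS with any core member exceeds T. The function names and the toy network are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def ecc(adj, u, v):
    """Edge-clustering coefficient: shared neighbors over min(deg - 1)."""
    common = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return common / denom if denom > 0 else 0.0

def functional_similarity(adj, expr, u, v):
    # ASSUMPTION: FS = ECC + PCC. The abstract does not give the exact formula.
    return ecc(adj, u, v) + pearson(expr[u], expr[v])

def grow_core(adj, expr, essential, seed, T):
    """Core = seed + its essential neighbors + neighbors with FS(seed, n) > T."""
    core = {seed}
    for n in adj[seed]:
        if n in essential or functional_similarity(adj, expr, seed, n) > T:
            core.add(n)
    return core

def attach(adj, expr, core, T):
    """Add direct neighbors of the core whose FS with some core member is > T."""
    # ASSUMPTION: "FS with the core" is read as FS with any single core member.
    result = set(core)
    for member in core:
        for n in adj[member]:
            if n not in core and functional_similarity(adj, expr, member, n) > T:
                result.add(n)
    return result

# Hypothetical toy data: a 4-protein network, 3-point expression profiles.
adj = {"A": {"B", "C"}, "B": {"A", "C", "D"}, "C": {"A", "B"}, "D": {"B"}}
expr = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [3, 2, 1], "D": [1, 2, 3]}
essential = {"A", "B"}

core = grow_core(adj, expr, essential, seed="A", T=0.5)
predicted = attach(adj, expr, core, T=0.5)
print(sorted(core), sorted(predicted))
```

In this toy run, B joins the core because it is essential, C is rejected because its expression profile is anti-correlated with its neighbors, and D is attached in the second phase through its correlation with B. Raising T makes the attachment step stricter, which is the kind of effect the parameter study in the abstract refers to.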

Integrating experimental and literature protein-protein interaction data for protein complex prediction

Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection

A Method for Predicting Protein Complex in Dynamic PPI Networks

Integrating Multiple Biomedical Resources for Protein Complex Prediction

Predicting Protein Complexes Via the Integration of Multiple Biological Information

From function to interaction: a new paradigm for accurately predicting protein complexes based on protein-to-protein interaction networks

Protein Complex Identification by Integrating Protein-Protein Interaction Evidence from Multiple Sources

A novel method to predict protein complexes based on Gene Ontology in PPI networks

Integration of protein sequence and protein–protein interaction data by hypergraph learning to identify novel protein complexes

Assessing and Predicting Protein Interactions by Combining Manifold Embedding with Multiple Information Integration

Protein Complex Detection in PPI Networks Based on Data Integration and Supervised Learning Method

Data integration and supervised learning based protein complex detection method

Proteome-wide prediction of protein-protein interactions from high-throughput data

Improved protein interaction models predict differences in complexes between human cell lines

A survey of computational methods for protein complex prediction from protein interaction networks

Integrating Sequence And Network Information To Enhance Protein-Protein Interaction Prediction Using Graph Convolutional Networks

Ontology Integration to Identify Protein Complex in Protein Interaction Networks.

Identification of Protein Complexes from Multi-Relationship Protein Interaction Networks

Prediction of Protein-Protein Interactions from Protein Sequences by Combining MatPCA Feature Extraction Algorithms and Weighted Sparse Representation Models

Computational methods for the prediction of protein-protein interactions.

Protein Complex Prediction in Large Ontology Attributed Protein-Protein Interaction Networks