Abstract: Many proteins work together in groups called complexes to achieve a specific function. Discovering protein complexes is important for understanding biological processes and predicting protein functions in living organisms. Large-scale, high-throughput techniques have made it possible to compile protein-protein interaction networks (PPI networks), which have been used in several computational approaches for detecting protein complexes. Those predictions might guide future experimental research in biology. Some approaches are topology-based, predicting highly connected groups of proteins to be complexes; some propose clustering algorithms based on partitioning or on overlapping clusters, for networks modeled as unweighted or weighted graphs; and others use cluster density together with information about protein functionality. However, some schemes still require long processing times, and the quality of their results can be improved. Furthermore, most results obtained with computational tools are not accompanied by an analysis of false positives. We propose an effective and efficient mining algorithm for discovering highly connected subgraphs, which serve as our basis for defining protein complexes. Our representation transforms the PPI network into a directed acyclic graph, which reduces both the number of represented edges and the search space for discovering subgraphs. Our approach handles weighted and unweighted PPI networks. Using PPI networks from Saccharomyces cerevisiae (yeast) and Homo sapiens (human), we compare our best alternative with state-of-the-art approaches in terms of clustering metrics, biological metrics and execution times, evaluating against three gold standards for yeast and two for human. Furthermore, we analyze false-positive predicted complexes by searching the PDBe (Protein Data Bank in Europe) database to identify matching protein complexes that have been purified and structurally characterized.
Our analysis shows that more than 50 yeast and more than 300 human predicted complexes that count as false positives, i.e., complexes predicted by our method but not described in the gold standard complex databases, in fact contain protein complexes that have been structurally characterized and documented in PDBe. We also found that some of these protein complexes have recently been classified as part of a Periodic Table of Protein Complexes. The latest version of our software is publicly available at http://doi.org/10.6084/m9.figshare.5297314.v1.
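
The abstract states that the PPI network is transformed into a directed acyclic graph so that fewer edges are represented, but it does not give the construction. The following is only a minimal sketch of one generic way to obtain such a graph, not the authors' algorithm: each undirected interaction is stored once, pointing from the lower-ranked protein to the higher-ranked one. The function name `orient_as_dag`, the ranking by (degree, name), and the toy network are all assumptions made for illustration.

```python
from collections import defaultdict


def orient_as_dag(edges):
    """Orient each undirected edge from its lower-ranked to its higher-ranked endpoint.

    Because every edge follows a single total order over the proteins,
    the resulting directed graph cannot contain a cycle.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Assumed ordering (hypothetical, for illustration only): by degree, then name.
    ordered = sorted(neighbors, key=lambda p: (len(neighbors[p]), p))
    rank = {protein: i for i, protein in enumerate(ordered)}

    # Keep only the neighbors that come later in the ordering.
    return {
        p: sorted((q for q in neighbors[p] if rank[q] > rank[p]), key=rank.get)
        for p in neighbors
    }


# Toy network: a triangle A-B-C plus one extra interaction C-D.
ppi_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
dag = orient_as_dag(ppi_edges)
stored_edges = sum(len(successors) for successors in dag.values())
```

In this sketch each interaction appears in exactly one adjacency list, whereas a symmetric adjacency-list representation would store it twice. Any search for dense subgraphs then only needs to follow edges in one direction.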

Complexes Detection in Biological Networks via Diversified Dense Subgraphs Mining

Detection of Complexes in Biological Networks Through Diversified Dense Subgraph Mining

Accurately Detecting Protein Complexes by Graph Embedding and Combining Functions with Interactions

Protein complex prediction via dense subgraphs and false positive analysis

Timing of parturition and postpartum mating in norway rats: Interaction of an interval timer and a circadian gate

Identifying Protein Complexes With Clear Module Structure Using Pairwise Constraints in Protein Interaction Networks

Independence in Possibility Theory under Different Triangular Norms

An effective approach to detecting both small and large complexes from protein-protein interaction networks

The mediating effect of cognitive development on children's worry elaboration.

An efficient protein complex mining algorithm based on Multistage Kernel Extension

Identifying Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks

Integration of protein sequence and protein–protein interaction data by hypergraph learning to identify novel protein complexes

Mining Dense Overlapping Subgraphs in weighted protein-protein interaction networks

Employing functional interactions for characterization and detection of sparse complexes from yeast PPI networks

An Effective Link-Based Clustering Algorithm for Detecting Overlapping Protein Complexes in Protein-Protein Interaction Networks

Protein Complexes Prediction Via Positive and Unlabeled Learning of the PPI Networks

Predicting Protein Complexes Via the Integration of Multiple Biological Information

A novel graph clustering method with a greedy heuristic search algorithm for mining protein complexes from dynamic and static PPI networks

CPredictor3.0: detecting protein complexes from PPI networks with expression data and functional annotations

Total variation norm for three-dimensional iterative reconstruction in limited view angle tomography

Hierarchical hidden community detection for protein complex prediction