Abstract: Protein–protein interactions (PPIs) underlie many important biological processes, and protein complexes are the key form in which these interactions are carried out. Understanding protein complexes and their functions is critical for elucidating the mechanisms of life processes and for disease diagnosis, treatment, and drug development. However, experimental methods for identifying protein complexes have many limitations, so computational prediction methods are needed. Protein sequences indicate the structure and biological function of proteins and also determine their ability to bind other proteins, thereby influencing the formation of protein complexes. Integrating sequence information with network information is therefore promising, but there is currently no effective framework that uses both protein sequence and PPI network topology for complex prediction. To address this challenge, we developed HyperGraphComplex, a method based on a hypergraph variational autoencoder that captures expressive features from protein sequences without feature engineering while also considering topological properties of PPI networks. Experimental results demonstrate that HyperGraphComplex achieves satisfactory predictive performance compared with state-of-the-art methods. Further bioinformatics analysis shows that the predicted protein complexes have attributes similar to those of known complexes. Moreover, case studies corroborate the model's ability to identify protein complexes, including three that were both experimentally validated by recent studies and supported by high-confidence structural predictions from AlphaFold-Multimer. We believe that the HyperGraphComplex algorithm and the accompanying proteome-wide, high-confidence protein complex prediction dataset will help elucidate how proteins regulate cellular processes in the form of complexes and will facilitate disease diagnosis, treatment, and drug development. Source code is available at https://github.com/LiDlab/HyperGraphComplex.
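
To make the hypergraph view concrete, the following is a minimal sketch and not the HyperGraphComplex implementation. It assumes that each protein is a vertex, that each candidate group of proteins is a hyperedge, and that per-protein feature vectors (standing in for sequence-derived features) are smoothed with one generic hypergraph convolution step of the form Dv^(-1/2) H De^(-1) H^T Dv^(-1/2) X. The toy proteins, the hyperedges, the identity features, and the function name `hypergraph_propagate` are all hypothetical.

```python
import numpy as np


def hypergraph_propagate(H, X):
    """One generic hypergraph smoothing step (illustrative only).

    H: (n_vertices, n_hyperedges) binary incidence matrix.
    X: (n_vertices, n_features) vertex feature matrix.
    Returns Dv^-1/2 H De^-1 H^T Dv^-1/2 X with unit hyperedge weights.
    """
    H = np.asarray(H, dtype=float)
    X = np.asarray(X, dtype=float)
    dv = H.sum(axis=1)  # vertex degree: number of hyperedges containing each vertex
    de = H.sum(axis=0)  # hyperedge degree: number of vertices in each hyperedge
    # Safe inverses: vertices or hyperedges with degree 0 get weight 0.
    dv_inv_sqrt = np.divide(1.0, np.sqrt(dv), out=np.zeros_like(dv), where=dv > 0)
    de_inv = np.divide(1.0, de, out=np.zeros_like(de), where=de > 0)
    # Scale rows by Dv^-1/2 and columns by De^-1 via broadcasting.
    left = dv_inv_sqrt[:, None] * H * de_inv[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return left @ right @ X


# Hypothetical toy data: five proteins (A..E) and two hyperedges,
# e0 = {A, B, C} and e1 = {C, D, E}.
H = np.array([
    [1, 0],  # A
    [1, 0],  # B
    [1, 1],  # C (shared by both hyperedges)
    [0, 1],  # D
    [0, 1],  # E
])
X = np.eye(5)  # placeholder features; real features would come from sequences
Z = hypergraph_propagate(H, X)
```

With identity features, `Z` is the propagation operator itself, so proteins that share a hyperedge receive nonzero weight on each other and proteins that share none receive zero. A variational autoencoder would stack learned weights and a latent sampling step on top of such a propagation; that part is omitted here.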