A Variational Graph Partitioning Approach to Modeling Protein Liquid-liquid Phase Separation

Gaoyuan Wang, Jonathan Warrell, Suchen Zheng, Mark Gerstein
DOI: https://doi.org/10.1101/2024.01.20.576375
2024-11-01
Abstract: Graph Neural Networks (GNNs) have emerged as a powerful general-purpose tool for representation learning across many domains. Their efficacy often depends on having an optimal underlying graph for prediction. In many cases, the most relevant information comes from specific subgraphs. In this work, we introduce a novel GNN architecture, called Graph Partitioned GNN (GP-GNN), designed to partition graphs during the prediction process, thereby focusing attention on subgraphs containing the most relevant information. Our approach jointly learns task-dependent graph partitions and representations, making it particularly effective for predictive tasks where critical features may reside within initially unidentified subgraphs. Protein Liquid-Liquid Phase Separation (LLPS) is an important physical problem well-suited to our architecture, primarily because protein sub-domains called intrinsically disordered regions (IDRs) are known to play a crucial role in the phase separation process. LLPS plays an essential role in cellular processes and is known to be associated with various diseases (e.g., neurodegenerative diseases and cancer). However, our ability to accurately predict which proteins undergo LLPS remains limited. In this study, we demonstrate how GP-GNN can be utilized to accurately predict LLPS by partitioning protein graphs into task-relevant subgraphs, such as those highlighting IDRs. Our model not only achieves state-of-the-art accuracy in predicting LLPS for both regulator and scaffold proteins but also offers valuable biological insights that can be used to guide further downstream investigations. Notably, upon examining the subgraphs identified by GP-GNN, we show that they are consistent with annotated IDRs.
Bioinformatics
What problem does this paper attempt to address?