PC-SENE: A Node Embedding Based Method for Protein Complex Detection

Xiaoxia Liu, Zhihao Yang, Shengtian Sang, Lei Wang, Yin Zhang, Hongfei Lin, Bo Xu, Yijia Zhang, Liang Yang, Kan Xu, Jian Wang
DOI: https://doi.org/10.1109/bibm.2018.8621338
2018-01-01
Abstract: With the accumulation of protein-protein interaction (PPI) datasets, various computational methods have been developed for identifying protein complexes from PPI networks. However, many existing computational methods have limitations: supervised learning approaches require tedious feature engineering, and the quality measures that guide the mining process of unsupervised methods do not fully reflect the properties of a protein complex in a PPI network. In this work, we propose a novel protein complex detection method named PC-SENE. For given seeds, it uses an alias sampling strategy based on protein node embedding similarities to select potential addable nodes, and it uses a new conductance measure to decide whether to extend the current candidate subgraph in order to find protein complexes. Intuitively, a well-trained node embedding vector can preserve both the topological characteristics of the PPI network and the diversity of connectivity patterns of nodes in the network, and thus node embedding similarities can better reflect the relationships between nodes. The experimental results show the robustness and effectiveness of PC-SENE.
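
The seed-expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the paper's new conductance measure, so the standard graph conductance is swapped in here, and the embeddings, the cosine similarity, the greedy acceptance rule, and all names (`build_alias`, `alias_draw`, `conductance`, `expand_seed`, the toy `graph` and `emb`) are assumptions made for illustration.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_alias(weights):
    """Vose's alias method: O(n) table setup for O(1) weighted sampling."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]
    prob, alias = [0.0] * n, [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:  # leftovers hold probability 1 up to rounding
        prob[i], alias[i] = 1.0, i
    return prob, alias

def alias_draw(prob, alias, rng):
    """Draw one index from the alias table."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

def conductance(graph, nodes):
    """Standard conductance (assumed stand-in for the paper's measure):
    cut edges divided by the smaller of the two side volumes."""
    cut = sum(1 for u in nodes for v in graph[u] if v not in nodes)
    vol_in = sum(len(graph[u]) for u in nodes)
    vol_out = sum(len(graph[u]) for u in graph) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 1.0

def expand_seed(graph, emb, seed, rng, max_tries=50):
    """Grow a candidate subgraph from a seed (hypothetical acceptance rule)."""
    cluster = {seed}
    for _ in range(max_tries):
        frontier = sorted({v for u in cluster for v in graph[u]} - cluster)
        if not frontier:
            break
        # Weight each neighbouring node by its best similarity to the cluster,
        # shifted so that every weight is strictly positive.
        weights = [max(cosine(emb[v], emb[u]) for u in cluster) + 1.0 + 1e-9
                   for v in frontier]
        prob, alias = build_alias(weights)
        cand = frontier[alias_draw(prob, alias, rng)]
        # Accept the sampled node only if it lowers conductance.
        if conductance(graph, cluster | {cand}) < conductance(graph, cluster):
            cluster.add(cand)
    return cluster

# Toy network: two triangles joined by the single edge c-d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
# Made-up 2-D embeddings: one direction per triangle.
emb = {
    "a": (1.0, 0.1), "b": (0.9, 0.0), "c": (1.0, 0.2),
    "d": (0.1, 1.0), "e": (0.0, 0.9), "f": (0.2, 1.0),
}
detected = expand_seed(graph, emb, "a", random.Random(0))
```

On this toy network, adding any node from the second triangle raises the conductance of the candidate subgraph, so the expansion from seed `a` should settle on the first triangle. In a real setting the embeddings would come from a trained node embedding model rather than being written by hand.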