DiscoGNN: A Sample-Efficient Framework for Self-Supervised Graph Representation Learning

Jun Xia, Shaorong Chen, Yue Liu, Zhangyang Gao, Jiangbin Zheng, Xihong Yang, Stan Z. Li
DOI: https://doi.org/10.1109/icde60146.2024.00223
2024-01-01
Abstract: Self-supervised graph representation learning has received increasing research interest recently, with generative and contrastive modeling being the two dominant approaches. Typically, generative learning first masks parts of each graph and then recovers the masked parts from the encoding of the corrupted graph. However, these methods mask only fixed parts of each graph and fail to train on all the nodes and edges, which prevents them from getting the most out of each graph. As a remedy, we propose a novel self-supervised strategy, dubbed DetCor, in which we first randomly replace some nodes and edges with alternative ones and then pre-train GNNs to detect and correct the replaced ones among all the nodes and edges. Additionally, for graph-level learning, the vanilla contrastive framework cannot reflect the distinctions among the in-batch negatives. To alleviate this issue, we propose RankGCL, which enables the contrastive framework to capture the similarity ranking information between graphs and is particularly effective in practical tasks based on graph similarity. DetCor and RankGCL together constitute a unified self-supervised framework, DiscoGNN, which matches or outperforms state-of-the-art strategies on multiple datasets from various domains. DiscoGNN is also sample-efficient: it can achieve better performance than competitive methods with much less pre-training data. The code is released at: https://github.com/junxia97/DiscoGNN-ICDE.
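The corruption step that the abstract describes for DetCor (replace some node or edge labels, then detect and correct them over all positions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `corrupt_labels`, the `replace_rate` and `vocab_size` parameters, the uniform sampling, and the example atom-type ids are all assumptions made for illustration. The abstract does not specify how replacements are drawn.

```python
import random


def corrupt_labels(labels, vocab_size, replace_rate=0.15, seed=0):
    """Randomly replace a fraction of labels with different ones.

    Hypothetical helper, not from the paper's code. Returns
    (corrupted, replaced): `replaced` holds per-position detection
    targets, and the original `labels` act as correction targets.
    """
    rng = random.Random(seed)
    corrupted = list(labels)
    replaced = [False] * len(labels)
    # Assumed policy: replace at least one position in a non-empty graph.
    n_replace = max(1, round(replace_rate * len(labels))) if labels else 0
    for i in rng.sample(range(len(labels)), n_replace):
        # Draw an alternative label that differs from the original one.
        alternatives = [v for v in range(vocab_size) if v != labels[i]]
        corrupted[i] = rng.choice(alternatives)
        replaced[i] = True
    return corrupted, replaced


# Hypothetical node (atom-type) ids for one small graph.
node_types = [6, 6, 8, 7, 6, 1, 1, 1]
corrupted, replaced = corrupt_labels(node_types, vocab_size=10, replace_rate=0.25)
```

Under these assumptions, a detection head would be trained to predict `replaced` at every position and a correction head to predict `node_types` from the encoding of `corrupted`, so every node contributes a training signal rather than only a masked subset. The same helper could be applied to edge-type ids.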