Subgraph Networks Based Contrastive Learning

Jinhuan Wang, Jiafei Shao, Zeyu Wang, Shanqing Yu, Qi Xuan, Xiaoniu Yang
2024-03-30
Abstract: Graph contrastive learning (GCL), as a self-supervised learning method, can solve the problem of annotated data scarcity. It mines explicit features in unannotated graphs to generate favorable graph representations for downstream tasks. Most existing GCL methods focus on the design of graph augmentation strategies and mutual information estimation operations. Graph augmentation produces augmented views by graph perturbations. These views preserve a locally similar structure and exploit explicit features. However, these methods have not considered the interactions that exist within subgraphs. To explore the impact of substructure interactions on graph representations, we propose a novel framework called subgraph network-based contrastive learning (SGNCL). SGNCL applies a subgraph network generation strategy to produce augmented views. This strategy converts the original graph into an edge-to-node mapping network with both topological and attribute features. The single-shot augmented view is a first-order subgraph network that mines the interactions between nodes, between nodes and edges, and between edges. In addition, we investigate the impact of second-order subgraph augmentation on mining graph structure interactions, and further propose a contrastive objective that fuses the first-order and second-order subgraph information. We compare SGNCL with classical and state-of-the-art graph contrastive learning methods on multiple benchmark datasets from different domains. Extensive experiments show that SGNCL achieves competitive or better performance (top three) on all datasets in unsupervised learning settings. Furthermore, SGNCL achieves the best average gain of 6.9% in transfer learning compared to the best method. Finally, experiments also demonstrate that mining substructure interactions has positive implications for graph contrastive learning.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses a gap in existing graph contrastive learning (GCL) methods: they do not account for the interactions between substructures of a graph. Most existing GCL methods focus on designing graph augmentation strategies and mutual information estimation operations. They generate augmented views through graph perturbations, which preserve locally similar structures and mine explicit features, but they overlook the interactions between subgraphs, which may significantly affect the learned graph representation.

To explore the impact of substructure interactions on graph representations, the authors propose a new framework, subgraph network-based contrastive learning (SGNCL). SGNCL uses a subgraph network generation strategy that transforms the original graph into an edge-to-node mapping network while retaining topological and attribute features. The resulting first-order subgraph network captures the interactions between nodes, between nodes and edges, and between edges. The authors also investigate the impact of second-order subgraph augmentation on mining graph structure interactions and propose a contrastive objective that fuses first-order and second-order subgraph information.

The main contributions of SGNCL are:
1. Combining subgraph networks (SGN) with graph contrastive learning in a new framework that captures the interaction information hidden between substructures of the original graph.
2. Introducing a subgraph network generation strategy in the graph augmentation module that converts edges to nodes from both the graph topology and the graph attribute perspectives.
3. Constructing first-order and second-order SGNs and exploring their impact on contrastive learning and downstream classification tasks. In addition to the contrastive loss for a single-order SGN, a contrastive loss based on multi-order SGN fusion is proposed to learn first-order and second-order subgraph features simultaneously.
4. Demonstrating through extensive experiments that SGNCL performs competitively with or better than classical and state-of-the-art contrastive learning methods on benchmark datasets from multiple domains.

With these contributions, SGNCL aims to overcome the limitations of existing GCL methods and improve graph representation learning, especially in unsupervised learning and transfer learning scenarios.
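The edge-to-node mapping behind the first-order and second-order subgraph networks can be illustrated with a small sketch. It assumes that the mapping is the standard line-graph construction: every edge of the input graph becomes a node, and two such nodes are linked when the original edges share an endpoint. The function name `subgraph_network` is hypothetical and not taken from the paper, and the attribute mapping is left out because this summary does not specify how attributes are carried over.

```python
from itertools import combinations


def subgraph_network(edges):
    """Return (nodes, edges) of the edge-to-node mapping of an undirected graph.

    Assumption: this is the plain line-graph construction; attribute
    features are not modelled here.
    """
    # Each distinct undirected edge becomes one node of the new network.
    nodes = list(dict.fromkeys(tuple(sorted(e)) for e in edges))

    # Group the new nodes by the original endpoint they touch.
    incident = {}
    for e in nodes:
        for v in set(e):  # set() avoids counting a self-loop twice
            incident.setdefault(v, []).append(e)

    # Two new nodes are linked when their original edges share an endpoint.
    sgn_edges = set()
    for group in incident.values():
        for a, b in combinations(group, 2):
            sgn_edges.add((a, b) if a <= b else (b, a))
    return nodes, sorted(sgn_edges)


# A path graph 0-1-2-3 as a toy input.
path = [(0, 1), (1, 2), (2, 3)]

# First-order network: three nodes (one per edge), two links.
nodes1, edges1 = subgraph_network(path)

# Second-order network: apply the same mapping to the first-order links.
nodes2, edges2 = subgraph_network(edges1)

print(nodes1)  # [(0, 1), (1, 2), (2, 3)]
print(edges1)  # [((0, 1), (1, 2)), ((1, 2), (2, 3))]
print(len(nodes2), len(edges2))  # 2 1
```

Applying the mapping twice gives the second-order network, whose nodes correspond to pairs of adjacent edges in the original graph. In this sketch, that is how higher-order views expose interactions between edges rather than only between nodes.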