Abstract: Training graph neural networks (GNNs) that generalize well on large-scale graphs is challenging. Existing methods mainly divide the input graph into multiple subgraphs and train on them in separate batches to improve scalability. However, the local batches obtained this way can carry topological bias relative to the complete graph structure. Prior studies have shown that this topological bias widens the gap between training and testing performance, that is, it weakens generalization. A straightforward remedy is contrastive learning, which trains node embeddings to be robust and invariant across augmented, imperfect graphs. However, most existing methods are inefficient because they contrast a large number of node pairs on a large-scale graph. Moreover, with random data augmentation they may degrade the embedding process by transforming well-sampled batches into meaningless graph structures. To bridge the gap between large-scale graph training and contrastive learning, we propose adaptive subgraph contrastive learning (AdaGCL). Given a batch of sampled subgraphs, we propose a subgraph-granularity contrastive loss that compares each anchor node with a limited number of subgraphs, which reduces the computation cost. AdaGCL tailors two key components for batch training: (1) batch-aware view generation, which preserves the intrinsic structure of each individual subgraph in the batch so that informative node embeddings can be learned; and (2) batch-aware pair sampling, which constructs the positive and negative contrasting subgraphs based on the anchor node's label. Experiments show that AdaGCL scales to graphs with millions of nodes and delivers consistent improvements over existing methods on various benchmark datasets. Furthermore, AdaGCL's running time is comparable to that of state-of-the-art contrastive learning methods that focus on improving efficiency. Finally, ablation studies of the two components demonstrate their effectiveness in improving the generalization of batch training. The code is available at https://github.com/YL-wang/CIKM_AdaGCL/.
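
To make the idea of a subgraph-granularity contrastive loss concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes an InfoNCE-style objective with cosine similarity, a mean-pooling readout for subgraph embeddings, and a single label attached to each subgraph for pair sampling; none of these choices is stated in the abstract, and the names `mean_pool`, `split_by_label`, and `subgraph_contrastive_loss` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def mean_pool(node_embeddings):
    """Average node embeddings into one subgraph embedding (assumed readout)."""
    n = len(node_embeddings)
    dim = len(node_embeddings[0])
    return [sum(e[d] for e in node_embeddings) / n for d in range(dim)]


def split_by_label(anchor_label, labeled_subgraphs):
    """Assumed pair sampling: a subgraph is positive if its (hypothetical)
    label matches the anchor node's label, and negative otherwise."""
    positives = [emb for label, emb in labeled_subgraphs if label == anchor_label]
    negatives = [emb for label, emb in labeled_subgraphs if label != anchor_label]
    return positives, negatives


def subgraph_contrastive_loss(anchor, positives, negatives, temperature=0.5):
    """InfoNCE-style loss contrasting one anchor node embedding against
    subgraph embeddings (an assumption; the paper's exact loss may differ)."""
    pos = sum(math.exp(cosine(anchor, s) / temperature) for s in positives)
    neg = sum(math.exp(cosine(anchor, s) / temperature) for s in negatives)
    return -math.log(pos / (pos + neg))


# Usage: one anchor node and three pooled subgraphs in the same batch.
anchor_embedding, anchor_label = [1.0, 0.0], 0
batch = [
    (0, mean_pool([[0.9, 0.1], [1.0, 0.0]])),   # same label as the anchor
    (1, mean_pool([[0.0, 1.0], [0.1, 0.9]])),   # different label
    (1, mean_pool([[-1.0, 0.0], [-0.8, 0.2]])),  # different label
]
pos_subgraphs, neg_subgraphs = split_by_label(anchor_label, batch)
loss = subgraph_contrastive_loss(anchor_embedding, pos_subgraphs, neg_subgraphs)
```

In this sketch each anchor is compared with only as many subgraphs as the batch contains, rather than with every other node, which is the source of the cost reduction the abstract describes.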

Graph Contrastive Learning with Node-Level Accurate Difference

Debiased Graph Contrastive Learning

SimGRACE: A Simple Framework for Graph Contrastive Learning Without Data Augmentation

Enhancing Graph Contrastive Learning with Node Similarity

Graph Contrastive Learning with Adaptive Augmentation

Subgraph Networks Based Contrastive Learning

Localized Contrastive Learning on Graphs

Edge Contrastive Learning: An Augmentation-Free Graph Contrastive Learning Model

Graph Contrastive Learning with Adaptive Proximity-Based Graph Augmentation

MDGCL: Graph Contrastive Learning Framework with Multiple Graph Diffusion Methods

Multi-Scale Self-Supervised Graph Contrastive Learning with Injective Node Augmentation

DPGCL: Dual pass filtering based graph contrastive learning

Localized Graph Contrastive Learning

Graph Contrastive Learning with Reinforced Augmentation

Neighbor Contrastive Learning on Learnable Graph Augmentation

Line graph contrastive learning for node classification

Augmentation-Free Graph Contrastive Learning of Invariant-Discriminative Representations

Adversarial Graph Augmentation to Improve Graph Contrastive Learning

Hyperedge Graph Contrastive Learning

AdaGCL: Adaptive Subgraph Contrastive Learning to Generalize Large-scale Graph Training

Graph Contrastive Representation Learning with Input-Aware and Cluster-Aware Regularization