A Subgraph Sampling Method for Training Large-Scale Graph Convolutional Network.

Qi Zhang, Yanfeng Sun, Yongli Hu, Shaofan Wang, Baocai Yin
DOI: https://doi.org/10.1016/j.ins.2023.119661
IF: 8.1
2023-01-01
Information Sciences
Abstract: Graph Convolutional Network (GCN) is a powerful model for graph representation learning. Because GCN updates nodes with a recursive neighbor aggregation scheme, training it on large-scale graphs incurs enormous computational cost and large memory requirements. Subgraph sampling methods speed up training by running GCN on small sampled subgraphs. However, these methods have problems of their own, such as training GCN on unconnected and scale-unbalanced subgraphs, which reduces both performance and efficiency. Moreover, existing subgraph sampling methods train GCN on each subgraph independently and ignore the relation information among different subgraphs. This paper proposes a novel subgraph sampling method, Improved Adaptive Neighbor Sampling (IANS), and a novel loss function, Subgraph Contrastive Loss. Subgraphs sampled by IANS are scale-balanced, the nodes inside each subgraph are significantly relevant to one another, and the sample ratio controls the sparsity of the subgraphs. To recover the relation information lost between different subgraphs, the Subgraph Contrastive Loss constrains nodes that are connected in the original graph but fall in different subgraphs to be closer in feature space, and pushes unconnected nodes far apart. In a series of experiments, GCN is trained with IANS and the Subgraph Contrastive Loss for node classification on three datasets of different scales. The training time and classification accuracy demonstrate the effectiveness of the proposed method.
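The abstract does not give the formula for the Subgraph Contrastive Loss, so the following is only a rough sketch of the general idea, not the authors' definition. It assumes a classic margin-based contrastive loss over node pairs that fall in different subgraphs: pairs connected in the original graph are pulled together, and unconnected pairs are pushed apart up to a margin. The function name `cross_subgraph_contrastive_loss`, the `margin` parameter, the Euclidean distance, and the averaging over pairs are all assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cross_subgraph_contrastive_loss(embeddings, subgraph_of, edges, margin=1.0):
    """Hypothetical margin-based contrastive loss over cross-subgraph pairs.

    embeddings:  dict mapping node id -> feature vector (list of floats)
    subgraph_of: dict mapping node id -> id of the sampled subgraph
    edges:       iterable of (u, v) edges of the ORIGINAL graph
    margin:      assumed hyperparameter; unconnected pairs farther apart
                 than this contribute nothing to the loss
    """
    edge_set = {frozenset(e) for e in edges}
    total, count = 0.0, 0
    for u, v in combinations(sorted(embeddings), 2):
        # Only pairs split across different subgraphs are considered.
        if subgraph_of[u] == subgraph_of[v]:
            continue
        d = euclidean(embeddings[u], embeddings[v])
        if frozenset((u, v)) in edge_set:
            # Connected in the original graph: pull the pair together.
            total += d ** 2
        else:
            # Unconnected: push the pair apart until it clears the margin.
            total += max(0.0, margin - d) ** 2
        count += 1
    return total / count if count else 0.0


# Toy example: nodes 0, 1 are in subgraph "A"; nodes 2, 3 are in subgraph "B".
# Edge (1, 2) of the original graph was cut by the sampling.
embeddings = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [1.0, 0.5], 3: [5.0, 0.0]}
subgraph_of = {0: "A", 1: "A", 2: "B", 3: "B"}
edges = [(0, 1), (1, 2), (2, 3)]
print(cross_subgraph_contrastive_loss(embeddings, subgraph_of, edges))  # 0.0625
```

In this toy case only the cut edge (1, 2) contributes, since every unconnected cross-subgraph pair is already farther apart than the margin. In actual training such a term would presumably be added to the node classification loss with some weighting, which the abstract does not specify.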