Abstract: Recently, graph convolutional networks (GCNs) have gained wide attention due to their ability to capture node relationships in graphs. However, training a full-batch GCN on large graph datasets incurs prohibitive computational and memory requirements. To address this issue, mini-batch GCN training was introduced; it improves scalability on large datasets by sampling and training on only a subset of the graph in each batch. Although several acceleration techniques have been designed to boost the efficiency of full-batch GCN training, they pay little attention to mini-batch GCN training, which differs from full-batch training in that the sampled graph structures change dynamically. Building on our previous work GCNTrain [28], which was originally designed to accelerate full-batch GCN training, we devise GCNTrain+, a universal accelerator that tackles the performance bottlenecks of both full-batch and mini-batch GCN training. GCNTrain+ is equipped with two engines that optimize computation and memory access in GCN training, respectively. To reduce the computation overhead, we propose to dynamically reconfigure the computation order based on the varying data dimensions involved in each training batch. Moreover, we build a unified computation engine that performs the sparse-dense matrix multiplications (SpDM) and sparse-sparse matrix multiplications (SpSpM) found in GCN training in a uniform manner. To alleviate the memory burden, we devise a two-phase dynamic clustering mechanism to capture data locality, as well as customized hardware to reduce the clustering overhead. We evaluate GCNTrain+ on seven datasets, and the results show that GCNTrain+ achieves 136.0×, 52.6×, 2.2×, and 1.5× speedups over CPU, GPU, GCNAX, and GCNTrain, respectively, in full-batch GCN training. Additionally, GCNTrain+ outperforms them with speedups of 131.6×, 67.1×, 4.4×, and 1.5×, respectively, in mini-batch GCN training.
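
To illustrate the idea of reconfiguring the computation order per batch, the following is a minimal sketch, not the accelerator's actual mechanism. It assumes a GCN layer computes the chain A·X·W (adjacency, features, weights) and estimates the multiply-accumulate (MAC) count of the two possible orderings with a simplified cost model. The function names, the cost model, and the example dimensions are all hypothetical and do not come from the paper.

```python
def layer_costs(nnz_a, nnz_x, n, f, h):
    """Estimate MAC counts for the two orderings of A @ X @ W.

    Assumed (hypothetical) shapes: A is n x n with nnz_a nonzeros,
    X is n x f with nnz_x nonzeros, W is a dense f x h matrix.
    """
    # (A @ X) @ W: each nonzero of A meets, on average, nnz_x / n nonzeros
    # in a row of X; the n x f intermediate is then treated as dense.
    ax_first = nnz_a * (nnz_x / n) + n * f * h
    # A @ (X @ W): sparse X times dense W, then sparse A times the
    # dense n x h result.
    xw_first = nnz_x * h + nnz_a * h
    return ax_first, xw_first


def choose_order(nnz_a, nnz_x, n, f, h):
    """Pick the ordering with the lower estimated MAC count."""
    ax_first, xw_first = layer_costs(nnz_a, nnz_x, n, f, h)
    return "(AX)W" if ax_first < xw_first else "A(XW)"


# Two hypothetical batches with different dimensions.
wide_input = choose_order(nnz_a=5000, nnz_x=10000, n=1000, f=500, h=16)
narrow_input = choose_order(nnz_a=5000, nnz_x=16000, n=1000, f=16, h=500)
print(wide_input, narrow_input)
```

Under this simplified model, a batch with wide sparse input features favors multiplying X by W first, while a batch with narrow dense features and a wide output favors aggregating with A first, which is why a fixed order can be suboptimal when batch dimensions vary.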

MG-GCN: Fast and Effective Learning with Mix-grained Aggregators for Training Large Graph Convolutional Networks

Mixed Geometry Message and Trainable Convolutional Attention Network for Knowledge Graph Completion

Adaptive sampling towards fast graph representation learning

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

GCNTrain: A Unified and Efficient Accelerator for Graph Convolutional Neural Network Training

SGCN: A Scalable Graph Convolutional Network with Graph-Shaped Kernels and Multi-Channels

A Subgraph Sampling Method for Training Large-Scale Graph Convolutional Network

GCNTrain+: A Versatile and Efficient Accelerator for Graph Convolutional Neural Network Training

L2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Permutohedral-GCN: Graph Convolutional Networks with Global Attention

Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition

Resource-Efficient Training for Large Graph Convolutional Networks with Label-Centric Cumulative Sampling

MGPOOL: multi-granular graph pooling convolutional networks representation learning

CDGCN: an Effective and Efficient Algorithm Based on Community Detection for Training Deep and Large Graph Convolutional Networks

SuperGCN: General and Scalable Framework for GCN Training on CPU-powered Supercomputers

Clustering with Entropy-based Recombination for Training GCNs on Large Graphs

Enhancing Graph Neural Networks by a High-quality Aggregation of Beneficial Information

Accurate, Efficient and Scalable Graph Embedding

Handling Over-Smoothing and Over-Squashing in Graph Convolution With Maximization Operation

Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture