Lu Ma, Zeang Sheng, Xunkai Li, Xinyi Gao, Zhezheng Hao, Ling Yang, Wentao Zhang, Bin Cui
Abstract: Graph Neural Networks (GNNs) have demonstrated effectiveness in various graph-based tasks. However, their inefficiency in training and inference presents challenges for scaling up to real-world and large-scale graph applications. To address the critical challenges, a range of algorithms have been proposed to accelerate training and inference of GNNs, attracting increasing attention from the research community. In this paper, we present a systematic review of acceleration algorithms in GNNs, which can be categorized into three main topics based on their purpose: training acceleration, inference acceleration, and execution acceleration. Specifically, we summarize and categorize the existing approaches for each main topic, and provide detailed characterizations of the approaches within each category. Additionally, we review several libraries related to acceleration algorithms in GNNs and discuss our Scalable Graph Learning (SGL) library. Finally, we propose promising directions for future research. A complete summary is presented in our GitHub repository:
### What problem does this paper attempt to address?
The paper addresses the inefficiency of Graph Neural Networks (GNNs) during training and inference. Specifically, GNNs face significant time-complexity and memory-complexity challenges when processing large-scale graph data, and these challenges limit their scalability in practical applications. In response, researchers have proposed a variety of acceleration algorithms that aim to improve the training and inference efficiency of GNNs and to reduce their storage and computational costs.
### Main contributions of the paper
1. **Systematic review**: The paper conducts a systematic review of GNN acceleration algorithms and divides them into three major categories: training acceleration, inference acceleration, and execution acceleration.
2. **Classification and summary**: For each category, the paper summarizes and classifies the existing methods and provides detailed characterizations of them.
3. **Review of related libraries**: The paper also reviews several libraries related to GNN acceleration and discusses the authors' own Scalable Graph Learning (SGL) library.
4. **Future research directions**: The paper proposes promising directions for future research.
### Specific problems and solutions
- **Time-complexity challenges**:
  - **Neighbor explosion**: GNNs recursively aggregate information from neighboring nodes at each layer, so the number of nodes involved in computing a single representation grows exponentially with the number of layers, and the computational cost grows with it.
  - **Discontinuous data access**: Because graph data is irregular, the data a node needs is usually not stored contiguously, which leads to high I/O overhead.
- **Memory-complexity challenges**:
  - **Full-batch training**: Full-batch training of GNNs requires storing the entire graph in GPU memory, which can exhaust that memory on large-scale graphs.
  - **Activation output storage**: Back-propagation requires the activation output of each layer to be stored in order to compute gradients. As the number of GNN layers increases, these activations occupy more and more GPU memory.
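
To make the two challenges above concrete, the following back-of-the-envelope sketch counts how many nodes a single target node can pull in, and how much memory the stored activations can take. It is an illustration only: the function names, the uniform-degree and no-overlap assumptions, and the example sizes (degree 50, fan-out 10, one million nodes, 256 hidden units) are assumptions made here, not figures from the paper.

```python
def receptive_field_bound(degree, num_layers):
    """Upper bound on the nodes touched when computing one node's embedding.

    Assumes (for illustration) that every node has exactly `degree`
    neighbors and that neighborhoods never overlap, so hop k adds
    degree**k new nodes.
    """
    return sum(degree ** k for k in range(num_layers + 1))


def activation_bytes(num_nodes, hidden_dim, num_layers, bytes_per_value=4):
    """Bytes needed to keep one hidden_dim-wide activation per node per layer.

    bytes_per_value=4 corresponds to 32-bit floats.
    """
    return num_nodes * hidden_dim * num_layers * bytes_per_value


if __name__ == "__main__":
    for layers in (1, 2, 3):
        full = receptive_field_bound(degree=50, num_layers=layers)
        capped = receptive_field_bound(degree=10, num_layers=layers)
        print(f"{layers} layer(s): all neighbors={full}, fan-out 10={capped}")

    gib = activation_bytes(1_000_000, 256, 3) / 2**30
    print(f"stored activations: {gib:.2f} GiB")
```

Under these assumptions, a 3-layer model can touch up to 127,551 nodes for one target node, while capping each node at 10 neighbors brings the bound down to 1,111. Three layers of 256-dimensional float32 activations for one million nodes come to roughly 2.86 GiB.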
### Acceleration methods
1. **Training acceleration**:
   - **Graph sampling**: Approximate node representations by sampling subgraphs, which reduces memory consumption. Examples include GraphSAGE, PinSAGE, and VR-GCN.
   - **GNN simplification**: Separate the propagation operation from the transformation operation, either by pre-computing the propagated features or by propagating the predictions in a post-processing stage. Examples include SGC, SIGN, and GBP.
2. **Inference acceleration**:
   - **Knowledge distillation**: Extract knowledge from large teacher models and transfer it to small student models to improve inference speed. Examples include LSP, TinyGNN, and GFKD.
   - **Quantization**: Convert model parameters and activations from high-precision to low-precision representations to reduce computation and memory consumption. Examples include Degree-Quant and SGQuant.
   - **Pruning**: Remove redundant weights and nodes from the model to reduce its size and the amount of computation. Examples include UGS and ICPG.
3. **Execution acceleration**:
   - **Binarization**: Binarize model parameters and activations to reduce computation and memory consumption even further. Examples include Bi-GCN and BGN.
   - **Graph compression**: Shrink the graph itself, for example by condensing it into a much smaller synthetic graph, so that computation operates on less data. Examples include GCond and SFGC.
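
Three of the ideas listed above (graph sampling, GNN simplification, and quantization) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation of any named method: the function names are hypothetical, the graph is assumed to be an adjacency list (a list of neighbor lists), propagation uses simple mean aggregation with a self-loop rather than the normalization a specific method such as SGC prescribes, and the quantizer is a generic symmetric post-hoc scheme rather than Degree-Quant or SGQuant.

```python
import random


def sample_neighbors(adj, node, fanout, rng):
    """Graph sampling: keep at most `fanout` randomly chosen neighbors."""
    neighbors = adj[node]
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)


def precompute_propagation(adj, features, num_hops):
    """GNN simplification: propagate features once, before any training.

    Each hop replaces a node's feature vector with the mean over itself
    and its neighbors. No learned weights are involved, so the result can
    be cached and fed to an ordinary (non-graph) classifier.
    """
    current = [list(row) for row in features]
    for _ in range(num_hops):
        propagated = []
        for node, neighbors in enumerate(adj):
            group = [node] + list(neighbors)
            dim = len(current[node])
            propagated.append(
                [sum(current[n][d] for n in group) / len(group)
                 for d in range(dim)]
            )
        current = propagated
    return current


def quantize_int8(values):
    """Quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    path_graph = [[1], [0, 2], [1]]          # 0 - 1 - 2
    feats = [[1.0], [0.0], [0.0]]
    print(sample_neighbors([[1, 2, 3, 4, 5]], 0, 2, random.Random(0)))
    print(precompute_propagation(path_graph, feats, num_hops=1))
    codes, scale = quantize_int8([1.0, -0.3, 0.0])
    print(codes, dequantize(codes, scale))
```

The design point shared by all three is that work is either skipped (sampling), moved out of the training loop (pre-computed propagation), or done at lower precision (quantization); each trades some accuracy for speed or memory.
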
### Conclusion
Through a systematic review of GNN acceleration algorithms, the paper provides a comprehensive reference framework for researchers and practitioners, which helps promote the practical application of GNNs to large-scale graph data.