QGABS: GPU Tensor Core-accelerated Quantized Graph Neural Network based on Adaptive Batch Size

Mengyang Zhang, Liwei Chen, Chao Shen
DOI: https://doi.org/10.1109/IAECST60924.2023.10503009
2023-12-08
Abstract: In recent years, Quantized Graph Neural Networks (QGNNs) have attracted extensive research and industrial interest. Although quantization has been applied to graph neural networks to reduce model storage and computational cost, performance bottlenecks remain when processing large-scale graph data. To address this, we propose QGABS (Quantized Graph Neural Network Accelerated by Adaptive Batch Size), which combines two techniques. First, an adaptive batch size strategy dynamically adjusts the batch size according to the characteristics of the graph data and the available computational resources, achieving a better balance between computation load and memory consumption. Second, by optimizing tensor operations and data layout to exploit the parallel computing capability of GPU Tensor Cores, we reduce the time spent on model training and inference. Experiments show that QGABS achieves an average inference speedup of 2.36x over the existing DGL framework and an average inference speedup of 1.26x over the state-of-the-art QGTC framework across different environments.
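The abstract does not give the batch-size selection rule, so the following is only a minimal sketch of what an adaptive choice could look like, not the authors' method. It assumes (hypothetically) that each batch stores a dense 1-bit adjacency matrix plus quantized node features, that the binding resource is a fixed memory budget, and that batch sizes are aligned to a Tensor Core friendly tile of 8. The names `batch_memory_bytes`, `pick_batch_size`, and `tile` are illustrative.

```python
def batch_memory_bytes(batch_size, feature_dim, feature_bits):
    """Estimated bytes for one batch (assumption: dense 1-bit adjacency
    plus node features stored at `feature_bits` bits per value)."""
    adj_bits = batch_size * batch_size
    feat_bits = batch_size * feature_dim * feature_bits
    return (adj_bits + feat_bits + 7) // 8  # round up to whole bytes


def pick_batch_size(num_nodes, feature_dim, feature_bits,
                    mem_budget_bytes, tile=8):
    """Largest batch size that fits the memory budget, rounded down to a
    multiple of `tile` (hypothetical alignment for Tensor Core tiles)."""
    lo, hi = 0, num_nodes
    # Binary search works because the memory estimate grows with batch size.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if batch_memory_bytes(mid, feature_dim, feature_bits) <= mem_budget_bytes:
            lo = mid
        else:
            hi = mid - 1
    aligned = (lo // tile) * tile
    # Fall back to the unaligned size when less than one full tile fits.
    return aligned if aligned >= tile else lo


if __name__ == "__main__":
    # 10,000 nodes, 128-dim features at 4 bits, 1 MB budget per batch.
    print(pick_batch_size(10_000, 128, 4, 1_000_000))
```

In this sketch a larger budget or a lower bit width yields a larger batch, which is one way to express the trade-off between computation load and memory consumption that the abstract describes.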
Computer Science