G-NMP: Accelerating Graph Neural Networks with DIMM-based Near-Memory Processing
Teng Tian, Xiaotian Wang, Letian Zhao, Wei Wu, Xuecang Zhang, Fangmin Lu, Tianqi Wang, Xi Jin
DOI: https://doi.org/10.1016/j.sysarc.2022.102602
IF: 5.836
2022-08-01
Journal of Systems Architecture
Abstract: Graph Neural Networks (GNNs) are of great value in numerous applications and promote the development of cognitive intelligence, owing to their capability of modeling non-Euclidean data structures. However, their inherent irregularity makes GNNs memory-bound, and the hybrid computing paradigm of GNNs poses significant challenges for efficient deployment on existing hardware architectures. Near-Memory Processing (NMP) is a promising solution for alleviating the memory wall problem. In this paper, we present G-NMP, a practical and efficient DIMM-based NMP solution for accelerating GNNs, which accelerates both sparse Aggregation and dense Combination computations on DIMM for the first time. We propose a novel G-NMP hardware architecture to exploit rank-level memory parallelism efficiently, and the G-ISA instructions to reduce host memory requests significantly. We apply several data flow optimizations to G-NMP to improve memory-compute overlap and to realize efficient matrix computation. We then develop an adaptive data allocation strategy for diverse vector sizes to further exploit feature-level parallelism. We also propose a novel memory request scheduling method to achieve flexible and low-overhead DRAM ownership transition between the host and G-NMP. Overall, G-NMP achieves consistent performance advantages across diverse GNN models and datasets, and offers 1.46× overall performance and 1.29× energy efficiency on average compared with the state-of-the-art work.
computer science, software engineering, hardware & architecture
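
The "hybrid computing paradigm" mentioned in the abstract can be pictured with a small sketch of one generic GNN layer: a sparse Aggregation phase that gathers neighbor features through irregular, data-dependent accesses, followed by a dense Combination phase that is a regular matrix product. This is an illustration only, not G-NMP's hardware data flow or G-ISA; the function names, the sum aggregator, and the toy graph are all assumptions made for the example.

```python
# Illustrative sketch of a generic GNN layer (assumed sum aggregator).
# It does not model G-NMP's architecture, instructions, or scheduling.

def aggregate(neighbors, features):
    """Sparse Aggregation: sum the feature vectors of each vertex's neighbors."""
    dim = len(features[0])
    out = []
    for nbrs in neighbors:
        acc = [0.0] * dim
        for u in nbrs:  # irregular, graph-dependent memory accesses
            for k in range(dim):
                acc[k] += features[u][k]
        out.append(acc)
    return out


def combine(aggregated, weight):
    """Dense Combination: multiply each aggregated vector by a weight matrix."""
    out_dim = len(weight[0])
    return [
        [sum(row[k] * weight[k][j] for k in range(len(row))) for j in range(out_dim)]
        for row in aggregated
    ]


# Hypothetical toy graph: adjacency lists, 2-element features, 2x2 weights.
neighbors = [[1, 2], [0], [0, 1]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 2.0], [3.0, 4.0]]

hidden = combine(aggregate(neighbors, features), weight)
```

The access pattern in `aggregate` depends on the graph structure, which is the source of the memory-bound behavior the abstract describes, while `combine` touches memory in a regular, predictable order.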