Abstract: Dynamic Graph Neural Networks (DGNNs) have recently attracted significant research attention across many domains, because most real-world graphs are inherently dynamic. Despite these efforts, existing hardware and software solutions for DGNNs still suffer from substantial redundant computation and memory access overhead, because they irregularly access and recompute all graph data of every graph snapshot. To address these issues, we propose RACE, an efficient redundancy-aware accelerator that enables energy-efficient execution of DGNN models. Specifically, we integrate a redundancy-aware incremental execution approach into the accelerator design, which obtains the output features of the latest graph snapshot by correctly and incrementally refining the output features of the previous snapshot, and which also enables regular accesses to vertices' input features. By traversing the graph on the fly, RACE identifies the vertices that are not affected by graph updates between successive snapshots and reuses their states (i.e., their output features) from the previous snapshot when processing the latest snapshot. The vertices that are affected by graph updates are also tracked, and their new states are incrementally recomputed from their neighbors' input features in the latest snapshot to preserve correctness. In this way, the processing and accessing of graph data that are not affected by graph updates can be safely eliminated, reducing redundant computation and memory access overhead. In addition, the input features that are accessed most frequently are dynamically identified according to the graph topology and are preferentially kept in on-chip memory to reduce off-chip communication.
Experimental results show that, for DGNN inference, RACE achieves average speedups of 1139× and 84.7× and average energy savings of 2242× and 234.2× over state-of-the-art software DGNN implementations running on an Intel Xeon CPU and an NVIDIA A100 GPU, respectively. Moreover, for DGNN inference, RACE obtains average speedups of 13.1×, 11.7×, 10.4×, and 7.9× and average energy savings of 14.8×, 12.9×, 11.5×, and 8.9× over the state-of-the-art Graph Neural Network accelerators AWB-GCN, GCNAX, ReGNN, and I-GCN, respectively.
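The incremental execution idea described in the abstract can be illustrated in software with a minimal sketch. This is not RACE's hardware dataflow: it assumes a single mean-aggregation layer with ReLU, undirected graphs without self-loops, and hypothetical helper names (`build_adj`, `layer_output`, `affected_vertices`, `incremental_forward`) chosen only for illustration. Under these assumptions, a vertex is affected if it is an endpoint of an added or removed edge, if its own input feature changed, or if it neighbors a vertex whose input feature changed; all other vertices reuse their previous output.

```python
import numpy as np

def build_adj(num_vertices, edges):
    """Adjacency sets for an undirected graph (assumed: no self-loops)."""
    adj = [set() for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def layer_output(v, adj, feats, weight):
    """One illustrative GNN layer: mean over v and its neighbors, linear, ReLU."""
    idx = sorted(adj[v] | {v})
    agg = feats[idx].mean(axis=0)
    return np.maximum(agg @ weight, 0.0)

def full_forward(adj, feats, weight):
    """Baseline: recompute every vertex of the snapshot."""
    return np.stack([layer_output(v, adj, feats, weight) for v in range(len(adj))])

def affected_vertices(old_edges, new_edges, changed_feature_vertices, new_adj):
    """Vertices whose one-layer output may differ between the two snapshots."""
    as_sets = lambda edges: {frozenset(e) for e in edges}
    affected = set()
    # Endpoints of every added or removed edge.
    for edge in as_sets(old_edges) ^ as_sets(new_edges):
        affected |= set(edge)
    # Vertices with changed input features, plus their current neighbors.
    for v in changed_feature_vertices:
        affected.add(v)
        affected |= new_adj[v]
    return affected

def incremental_forward(prev_out, affected, new_adj, new_feats, weight):
    """Reuse previous outputs and recompute only the affected vertices."""
    out = prev_out.copy()
    for v in affected:
        out[v] = layer_output(v, new_adj, new_feats, weight)
    return out

# Two successive snapshots of a 6-vertex path graph (illustrative data only).
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))
weight = rng.normal(size=(4, 3))
old_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
new_edges = old_edges + [(0, 2)]       # one edge is added
new_feats = feats.copy()
new_feats[5] += 1.0                    # one vertex's input feature changes

old_adj = build_adj(6, old_edges)
new_adj = build_adj(6, new_edges)
prev_out = full_forward(old_adj, feats, weight)

affected = affected_vertices(old_edges, new_edges, [5], new_adj)
out_inc = incremental_forward(prev_out, affected, new_adj, new_feats, weight)
out_full = full_forward(new_adj, new_feats, weight)
```

In this toy case only four of the six vertices are recomputed, and the incremental result matches a full recomputation of the new snapshot. With more than one layer, the affected set would have to be expanded by one hop per additional layer, which this sketch does not do.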