An Efficient GCN Accelerator Based on Workload Reorganization and Feature Reduction

Zhuang Shao, Chenjia Xie, Zihan Ning, Qi Wu, Liang Chang, Yuan Du, Li Du
DOI: https://doi.org/10.1109/tcsi.2023.3343515
2024-01-01
Abstract: The irregular adjacency matrix and the mismatched computation patterns of the Aggregation and Combination phases make Graph Neural Networks (GNNs) challenging to compute efficiently. This paper proposes a software and hardware co-design system that reduces computational latency and memory access through workload reorganization and feature reduction. In software, the adjacency matrix is preprocessed, and the workload in both the feature and node dimensions is concentrated to optimize memory access and hardware utilization. The interlayer nodes are analyzed using Principal Component Analysis (PCA) to find the minimum feature vector length based on information redundancy, and a unique weight initialization is used during retraining to trim the feature vector to that minimum length. In hardware, an efficient GCN accelerator is designed to fully support the reorganized workload through reconfigurable output node computation. The accelerator is implemented in 28-nm CMOS technology and achieves 3.3 TOPS peak throughput and 2.6 TOPS/W energy efficiency. Compared with HyGCN, the proposed method improves overall performance by 5$\times$ with a negligible accuracy loss of less than 0.5%.
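The PCA-based search for a minimum feature vector length can be illustrated with a minimal sketch. The abstract does not state the criterion used to pick that length, so the cumulative explained-variance threshold below (`keep_variance`), the function name `min_feature_length`, and the synthetic data are all assumptions made for illustration, not the paper's procedure.

```python
import numpy as np


def min_feature_length(features: np.ndarray, keep_variance: float = 0.99) -> int:
    """Return the smallest number of principal components whose cumulative
    explained variance reaches keep_variance (an assumed criterion)."""
    # Center each feature column so the SVD corresponds to PCA.
    centered = features - features.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to the variance per component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0.0:
        return 1  # constant features carry no variance; keep one dimension
    cumulative = np.cumsum(variances) / total
    # First position where the cumulative ratio reaches the threshold.
    k = int(np.searchsorted(cumulative, keep_variance)) + 1
    return min(k, len(variances))


# Hypothetical interlayer features: 200 nodes, 16 dimensions, but only
# 3 independent directions, so most of the 16 dimensions are redundant.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 16))
features = latent @ mixing

print(min_feature_length(features, keep_variance=0.999))
```

Because the synthetic features have rank 3, the reported length is at most 3 of the original 16 dimensions. In the paper's flow, a length found this way would then be enforced by retraining with the described weight initialization, which this sketch does not cover.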