SoGraph: A State-Aware Architecture for Out-of-Memory Graph Processing on HBM-Equipped FPGAs

Qianyu Cheng, Zhendong Zheng, Tianhao Jiang, Cheng Tang, Teng Wang, Lei Gong, Chao Wang, Xuehai Zhou
DOI: https://doi.org/10.1109/fpl64840.2024.00021
2024-01-01
Abstract: Emerging FPGA devices are widely used in graph processing and achieve high performance by exploiting fine-grained parallelism and high-bandwidth memory (HBM) subsystems with dozens of channels. However, graph sizes are growing rapidly and often exceed the capacity of the FPGA device or of a single memory channel, causing out-of-memory (OoM) faults and performance degradation. Existing approaches focus on improving the performance of processing a full graph that resides on the device, and neglect support for hyper-scale graphs and cross-iteration acceleration. In this work, we propose SoGraph, a graph processing architecture that accelerates large graphs on HBM-equipped FPGAs in cooperation with the host. To minimize PCIe traffic during processing, we adopt three hardware-oriented optimizations: 1) an activeness-driven subgraph scheduler, which minimizes the on-device graph resident set; 2) a lightweight state-aware processing extension to the edge-centric paradigm; 3) a bit-level update analyzer that reduces the subgraphs to be transferred. The results show that the prototype on an Alveo U280 FPGA achieves an average 3.18x speedup over the modified state-of-the-art FPGA design and 1.3x the energy efficiency of the GPU solution.
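As an illustration only, the general idea behind activeness-driven subgraph scheduling can be sketched in software. The abstract does not describe the actual SoGraph design, so everything here is an assumption: the partitioning by source-vertex interval, the function names `partition_edges` and `bfs_active_subgraphs`, and the use of BFS as the workload are all hypothetical. The sketch only shows that, in an edge-centric traversal, a subgraph needs to be moved to the device in a given iteration only if it contains at least one active source vertex.

```python
from collections import defaultdict


def partition_edges(edges, interval):
    """Group edges into subgraphs keyed by source-vertex interval (assumed scheme)."""
    parts = defaultdict(list)
    for src, dst in edges:
        parts[src // interval].append((src, dst))
    return parts


def bfs_active_subgraphs(edges, num_vertices, root, interval):
    """Edge-centric BFS that records which subgraphs each iteration would load.

    Hypothetical software model, not the SoGraph hardware scheduler.
    """
    parts = partition_edges(edges, interval)
    level = [-1] * num_vertices
    level[root] = 0
    active = {root}
    loaded_per_iter = []
    depth = 0
    while active:
        # Only subgraphs holding at least one active source vertex are needed.
        needed = sorted({v // interval for v in active if v // interval in parts})
        loaded_per_iter.append(needed)
        next_active = set()
        for pid in needed:
            # Edge-centric: stream every edge of the loaded subgraph.
            for src, dst in parts[pid]:
                if src in active and level[dst] == -1:
                    level[dst] = depth + 1
                    next_active.add(dst)
        active = next_active
        depth += 1
    return level, loaded_per_iter


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
    level, loaded = bfs_active_subgraphs(edges, num_vertices=6, root=0, interval=2)
    print(level)
    print(loaded)
```

In this toy graph the subgraph holding edge (4, 5) is never loaded, because no vertex in its source interval ever becomes active. A real design would also have to account for transfer granularity and for which subgraphs are already resident on the device, which this sketch ignores.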