PRAGA: A Priority-Aware Hardware/Software Co-design for High-Throughput Graph Processing Acceleration

Long Zheng, Bing Zhu, Pengcheng Yao, Yuhang Zhou, Chengao Pan, Wenju Zhao, Xiaofei Liao, Hai Jin, Jingling Xue
DOI: https://doi.org/10.1145/3701998
IF: 1.444
2024-01-01
ACM Transactions on Architecture and Code Optimization
Abstract: Graph processing is pivotal in deriving insights from complex data structures but faces performance limitations due to the irregular nature of graphs. Traditional general-purpose processors often struggle with low instruction-level parallelism and energy inefficiency when handling graph data. In response, modern graph accelerators have embraced an intra-edge-parallel model to enhance parallelization, significantly outperforming conventional processors. However, the indiscriminate processing of edges in existing systems results in substantial computational redundancy, negatively impacting overall efficiency. This paper introduces PRAGA, an innovative graph accelerator designed to optimize efficiency by selectively processing edges that significantly contribute to final results while preserving high computational parallelism. PRAGA utilizes an intra-edge-sequential model, prioritizing edge processing to capitalize on coarse-grained vertex-level parallelism and minimize unnecessary computations. It incorporates a hot-value manager to alleviate network-on-chip congestion and a memory-aware coalescer to minimize redundant data accesses. Our experimental results, obtained using a Xilinx Alveo U280 FPGA accelerator card, demonstrate that PRAGA achieves speedups of 17.88× and 5.86× over state-of-the-art accelerators ScalaGraph and GraphDynS, respectively, and outperforms the advanced GPU-based system Gunrock by 22.52× on average. This substantial improvement underscores PRAGA’s potential to redefine performance benchmarks in graph processing.
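The redundancy the abstract attributes to indiscriminate edge processing can be illustrated in software with single-source shortest paths. The sketch below is not PRAGA's hardware design; it is a minimal Python illustration, under the assumption that "priority" means the current tentative distance, of how ordering the work can reduce the number of edges processed. The function names and the four-vertex graph are hypothetical.

```python
import heapq
from collections import deque

INF = float("inf")

def sssp_fifo(graph, source):
    """Shortest paths with a FIFO worklist: active vertices are
    processed in arrival order, regardless of how promising they are."""
    dist = {v: INF for v in graph}
    dist[source] = 0
    work = deque([source])
    edges_processed = 0
    while work:
        u = work.popleft()
        for v, w in graph[u]:
            edges_processed += 1
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                work.append(v)  # v must be reprocessed later
    return dist, edges_processed

def sssp_priority(graph, source):
    """Shortest paths with a priority worklist (Dijkstra-style): the vertex
    with the smallest tentative distance is processed first."""
    dist = {v: INF for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    edges_processed = 0
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph[u]:
            edges_processed += 1
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist, edges_processed

# Hypothetical graph: vertex -> list of (neighbor, non-negative weight).
example_graph = {
    0: [(1, 10), (2, 1)],
    1: [(3, 1)],
    2: [(1, 1)],
    3: [],
}

fifo_dist, fifo_edges = sssp_fifo(example_graph, 0)
prio_dist, prio_edges = sssp_priority(example_graph, 0)
print(fifo_dist == prio_dist, fifo_edges, prio_edges)
```

Both functions return the same distances, but the FIFO version propagates the poor initial estimate for vertex 1 along edge 1→3 before the better path through vertex 2 arrives, so that edge is processed twice. The priority version defers vertex 1 until its distance is final and processes each edge once.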