DepGraph: A Dependency-Driven Accelerator for Efficient Iterative Graph Processing

Yu Zhang, Xiaofei Liao, Hai Jin, Ligang He, Bingsheng He, Haikun Liu, Lin Gu
DOI: https://doi.org/10.1109/hpca51647.2021.00039
2021-01-01
Abstract: Many graph processing systems have recently been developed for many-core processors. However, in iterative graph processing, because of the dependencies between vertices' states, new vertex states are inherently propagated sequentially along graph paths, and these propagations also depend on each other. Despite years of research effort, existing solutions still severely underutilize many-core processors when propagating new vertex states and therefore converge slowly. In this paper, we propose a dependency-driven programmable accelerator, DepGraph, which couples with the core architecture of the many-core processor and can fundamentally alleviate the challenge of dependencies for faster state propagation. Specifically, we integrate an effective dependency-driven asynchronous execution approach into novel microarchitecture designs for faster state propagation. DepGraph prefetches the vertices for the core on the fly along the dependency chains between their states and the active vertices' new states, aiming to accelerate the propagation of the active vertices' new states and to ensure better data locality. By transforming the dependency chains along frequently used paths into direct ones at runtime and maintaining these calculated direct dependencies as a set of fast shortcuts, called the hub index, DepGraph further accelerates most state propagations. In addition, many propagations do not need to wait for the completion of other propagations, which enables more propagations to be conducted along the paths with a higher degree of parallelism. The experimental results show that, for iterative graph processing on a simulated 64-core processor, a cutting-edge software graph processing system can achieve a 5.0-22.7 times speedup after integrating DepGraph while incurring only 0.6% area cost. In comparison with three state-of-the-art hardware solutions, i.e., HATS, Minnow, and PHI, DepGraph improves performance by up to 3.0-14.2, 2.2-5.8, and 2.4-10.1 times, respectively.
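The shortcut idea behind the hub index can be illustrated in software with a small sketch. The example below is an assumption-laden illustration rather than the paper's hardware design: it uses single-source shortest paths as the iterative algorithm, and the function names (`path_shortcut`, `propagate_along_chain`, `propagate_via_shortcut`) and the toy path are hypothetical. It shows that a chain of per-edge dependencies along a path can be collapsed into one direct dependency, so that a new state at the head of the path reaches the tail in a single update instead of one update per edge.

```python
import math


def path_shortcut(weights):
    """Collapse a chain of edge weights into one direct dependency.

    For shortest paths, the direct dependency between the two ends of a
    path is simply the sum of the edge weights along it (assumption: the
    path is the only route considered for this shortcut).
    """
    return sum(weights)


def propagate_along_chain(dist, path, weights):
    """Propagate a new state edge by edge; returns the number of sequential steps."""
    steps = 0
    for (u, v), w in zip(zip(path, path[1:]), weights):
        dist[v] = min(dist[v], dist[u] + w)
        steps += 1
    return steps


def propagate_via_shortcut(dist, src, dst, shortcut):
    """Propagate the same new state through the precomputed shortcut in one step."""
    dist[dst] = min(dist[dst], dist[src] + shortcut)
    return 1


# Hypothetical toy path a -> b -> c -> d with edge weights 2, 1, 4.
path = ["a", "b", "c", "d"]
weights = [2.0, 1.0, 4.0]

dist_chain = {v: math.inf for v in path}
dist_chain["a"] = 0.0
chain_steps = propagate_along_chain(dist_chain, path, weights)

dist_hub = {v: math.inf for v in path}
dist_hub["a"] = 0.0
shortcut = path_shortcut(weights)
hub_steps = propagate_via_shortcut(dist_hub, "a", "d", shortcut)

print(dist_chain["d"], chain_steps)
print(dist_hub["d"], hub_steps)
```

Both strategies give the tail vertex the same state; the shortcut only removes the sequential waiting along the path. The intermediate vertices `b` and `c` still need their own updates, which in this sketch would be done separately and need not block the update to `d`.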