GraphFly: Efficient Asynchronous Streaming Graphs Processing Via Dependency-Flow

Dan Chen, Chuangyi Gui, Yi Zhang, Hai Jin, Long Zheng, Yu Huang, Xiaofei Liao
DOI: https://doi.org/10.1109/sc41404.2022.00050
2022-01-01
Abstract: Existing streaming graph processing systems typically adopt two phases, refinement and recomputation, to ensure the correctness of incremental computation. However, unnecessary synchronization among independent edge updates causes severe redundant memory accesses. In this paper, we present GraphFly, a high-performance asynchronous streaming graph processing system based on dependency-flows. GraphFly features three key designs: 1) dependency trees (D-trees), which help identify independent graph updates quickly and at low cost; 2) a dependency-flow-based processing model, which exploits space-time dependent co-scheduling for cache efficiency; 3) a specialized graph data layout, which further reduces memory accesses. We evaluate GraphFly, and the results show that it outperforms the state-of-the-art systems KickStarter and GraphBolt by 5.81× and 1.78× on average, respectively. GraphFly also scales well across different update batch sizes and compute resources.