Effective Runtime Scheduling for High-Performance Graph Processing on Heterogeneous Dataflow Architecture

Chen Qingxiang, Zheng Long, Liao Xiaofei, Jin Hai, Wang Qinggang
DOI: https://doi.org/10.1007/s42514-020-00041-w
2020-01-01
CCF Transactions on High Performance Computing
Abstract: Graph processing is widely used in domains such as social networks, bioinformatics, and information networks. Dataflow architectures have been shown to effectively address the low instruction-level parallelism and branch mispredictions that graph applications suffer on existing general-purpose architectures. In this paper, targeting a customized heterogeneous dataflow architecture that integrates the hardware advantages of both dataflow architecture and traditional control architecture, we propose a novel runtime system that adaptively offloads each subgraph to an appropriate underlying architecture. We also present a hybrid execution model to drive optimal performance. Our implementation on a CPU-FPGA platform achieves a 2.2x throughput improvement over a state-of-the-art CPU-FPGA graph processing accelerator and a 2.4x throughput improvement over a state-of-the-art FPGA-based design.