TuNao: A High-Performance and Energy-Efficient Reconfigurable Accelerator for Graph Processing

Jinhong Zhou, Shaoli Liu, Qi Guo, Xuda Zhou, Tian Zhi, Daofu Liu, Chao Wang, Xuehai Zhou, Yunji Chen, Tianshi Chen
DOI: https://doi.org/10.1109/ccgrid.2017.114
2017-01-01
Abstract: Large-scale graph processing is now a crucial task in many commercial applications, and it is conventionally supported by general-purpose processors. These processors are designed to flexibly support highly diverse workloads with classic techniques such as on-chip caches and dynamic pipelining. Yet it is difficult for the on-chip cache to exploit the irregular data locality of large-scale graph processing, even though a few high-degree vertices are frequently accessed in real-world graphs. It is also inefficient to perform regular arithmetic operations via sophisticated dynamic pipelining. In short, general-purpose processors are not ideal platforms for graph processing. In this paper, we design a reconfigurable graph processing accelerator, with the purpose of providing an energy-efficient and flexible hardware platform for large-scale graph processing. This accelerator features two main components: on-chip storage that exploits the data locality of graph processing, and reconfigurable functional units that adapt to the diverse operations of different graph processing tasks. On a total of 36 practical graph processing tasks, we demonstrate that, on average, our accelerator design achieves 1.58x and 25.56x better performance and energy efficiency, respectively, than the GPU baseline.