Balancing Memory Accesses for Energy-Efficient Graph Analytics Accelerators.

Mingyu Yan, Xing Hu, Shuangchen Li, Itir Akgun, Han Li, Xin Ma, Lei Deng, Xiaochun Ye, Zhimin Zhang, Dongrui Fan, Yuan Xie
DOI: https://doi.org/10.1109/islped.2019.8824832
2019-01-01
Abstract: Domain-specific accelerators for graph analytics leverage a large on-chip memory in order to tackle the intensive random memory accesses, offering higher performance and energy efficiency than conventional architectures. However, limited by the inefficient usage of on-chip memory, current accelerators suffer from energy and performance bottlenecks due to the large amount of off-chip memory accesses. In this work, we introduce an online preprocessing step for the vertex-centric programming model based on our observation of imbalanced memory bandwidth utilization between two execution phases. Our scheme improves energy efficiency and performance by significantly reducing off-chip accesses in two ways. First, we sequence random off-chip memory accesses to balance memory bandwidth demands and improve the utilization of on-chip memory. Second, we prune active leaf vertices to avoid redundant memory accesses. We evaluate our method on a state-of-the-art graph analytics accelerator and achieve 1.6× speedup while reducing energy consumption by 42% on average.
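The two ideas named in the abstract can be illustrated with a small software sketch. This is not the paper's hardware design: the function names `prune_and_sequence` and `block_switches` are hypothetical, a "leaf" is assumed here to mean an active vertex with no outgoing edges, and "sequencing" is approximated by sorting pending updates by destination vertex ID so that neighbouring updates fall in the same memory block.

```python
from typing import Dict, List, Tuple


def prune_and_sequence(
    active: List[int], out_edges: Dict[int, List[int]]
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Toy preprocessing step for one vertex-centric iteration.

    Assumption: a "leaf" is an active vertex with no outgoing edges,
    so processing it would generate no updates.
    """
    # Pruning: drop active vertices that have nothing to propagate.
    kept = [v for v in active if out_edges.get(v)]
    # Sequencing: collect (destination, source) updates and sort them
    # by destination so accesses walk memory in ascending order.
    updates = sorted((dst, src) for src in kept for dst in out_edges[src])
    return kept, updates


def block_switches(dsts: List[int], block_size: int) -> int:
    """Count how often consecutive accesses land in a different block."""
    switches = 0
    prev = None
    for d in dsts:
        block = d // block_size
        if block != prev:
            switches += 1
            prev = block
    return switches


# Hypothetical four-vertex example: vertices 1 and 3 are leaves.
out_edges = {0: [5, 1], 1: [], 2: [4, 0], 3: []}
active = [0, 1, 2, 3]

kept, updates = prune_and_sequence(active, out_edges)
unsorted_dsts = [dst for src in kept for dst in out_edges[src]]
sorted_dsts = [dst for dst, _ in updates]
```

In this toy example, pruning halves the active list, and with a block size of 2 the sorted update stream changes block less often than the original order, which is the kind of locality the abstract attributes to sequencing.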