Proxy-Guided Load Balancing of Graph Processing Workloads on Heterogeneous Clusters

Shuang Song,Meng Li,Xinnian Zheng,Michael LeBeane,Jee Ho Ryoo,Reena Panda,Andreas Gerstlauer,Lizy K. John
DOI: https://doi.org/10.1109/icpp.2016.16
2016-01-01
Abstract:Big data decision-making techniques take advantage of large-scale data to extract important insights from them. One of the most important classes of such techniques falls in the domain of graph applications, where data segments and their inherent relationships are represented as vertices and edges. Efficiently processing large-scale graphs involves many subtle tradeoffs and is still regarded as an open-ended problem. Furthermore, as modern data centers move towards increased heterogeneity, the traditional assumption of homogeneous environments in current graph processing frameworks is no longer valid. Prior work estimates the graph processing power of heterogeneous machines by simply reading hardware configurations, which leads to suboptimal load balancing. In this paper, we propose a profiling methodology leveraging synthetic graphs for capturing a node's computational capability and guiding graph partitioning in heterogeneous environments with minimal overheads. We show that by sampling the execution of applications on synthetic graphs following a power-law distribution, the computing capabilities of heterogeneous clusters can be captured accurately (<;10% error). Our proxy-guided graph processing system results in a maximum speedup of 1.84x and 1.45x over a default system and prior work, respectively. On average, it achieves 17.9% performance improvement and 14.6% energy reduction as compared to prior heterogeneity-aware work.
What problem does this paper attempt to address?