Study on Partitioning Real-World Directed Graphs of Skewed Degree Distribution

Jie Yan, Guangming Tan, Ninghui Sun
DOI: https://doi.org/10.1109/icpp.2015.37
2015-01-01
Abstract: Distributed computation on directed graphs has become increasingly important in emerging big data analytics. However, partitioning huge real-world graphs, such as social and web networks, is known to be challenging because of their skewed (or power-law) degree distributions. In this paper, by investigating two representative k-way balanced edge-cut methods (the LDG streaming heuristic and METIS) on 12 real social and web graphs, we empirically find that both LDG and METIS can partition page-level web graphs with extremely high quality, but fail to generate low-cut balanced partitions for social networks and host-level web graphs. Our deep analysis identifies the global star-motif structures around high-degree vertices as the main obstacle to high-quality partitioning. Based on the empirical study, we further propose a new distributed graph model, namely Agent-Graph, and the Agent+ framework that partitions power-law graphs in the Agent-Graph model. Agent-Graph is a vertex-cut variant in the context of message passing, where any high-degree vertex is factored into arbitrary computational agents in remote partitions for message combining and scattering. The Agent+ framework filters out the high-degree vertices to form a residual graph, which is then partitioned with high quality by existing edge-cut methods, and finally refills the high-degree vertices as agents to construct an agent-graph. Experiments show that the Agent+ approach consistently generates high-quality partitions for all tested real-world skewed graphs. In particular, for 64-way partitioning on social networks and host-level web graphs, the Agent+ approach reduces the equivalent edge cut by 27%~79% for LDG and 23%~82% for METIS.
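The filter, partition, and refill steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`ldg_partition`, `agent_partition`), the `degree_threshold` parameter, the choice of a "master" partition for each high-degree vertex, and the treatment of edges as undirected when counting degrees are all assumptions made for illustration. The partitioner is a simplified form of the LDG streaming heuristic, which places each vertex in the partition that holds most of its already-placed neighbors, weighted by the partition's remaining capacity.

```python
from collections import defaultdict


def ldg_partition(vertices, neighbors, k, capacity):
    """Simplified LDG-style streaming partitioner (illustrative only).

    Each vertex goes to the non-full partition p that maximizes
    |N(v) in p| * (1 - size(p) / capacity); ties go to the emptier partition.
    """
    part = {}
    sizes = [0] * k
    for v in vertices:
        best, best_key = None, None
        for p in range(k):
            if sizes[p] >= capacity:
                continue  # partition is full, skip it to keep balance
            shared = sum(1 for u in neighbors[v] if part.get(u) == p)
            key = (shared * (1.0 - sizes[p] / capacity), -sizes[p])
            if best_key is None or key > best_key:
                best, best_key = p, key
        part[v] = best
        sizes[best] += 1
    return part


def agent_partition(edges, k, degree_threshold):
    """Filter high-degree vertices, partition the residual graph, refill.

    degree_threshold is a hypothetical cut-off; direction is ignored when
    counting degrees, which is a simplification for this sketch.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Step 1: filter out high-degree vertices to form the residual graph.
    hubs = {v for v, ns in neighbors.items() if len(ns) > degree_threshold}
    residual = {v: neighbors[v] - hubs for v in neighbors if v not in hubs}

    # Step 2: partition the residual graph with an edge-cut method.
    order = sorted(residual)
    capacity = max(1, -(-len(order) // k))  # ceil(n / k), at least 1
    part = ldg_partition(order, residual, k, capacity)

    # Step 3: refill each hub. It is placed in the partition holding most of
    # its low-degree neighbors (an assumed rule) and gets one agent in every
    # other partition that holds at least one of those neighbors.
    agents = {}
    for h in sorted(hubs):
        counts = [0] * k
        for u in neighbors[h]:
            if u not in hubs:
                counts[part[u]] += 1
        master = max(range(k), key=lambda p: counts[p])
        part[h] = master
        agents[h] = {p for p in range(k) if counts[p] > 0 and p != master}
    return part, agents
```

On a toy graph of two triangles joined only through one high-degree vertex, the sketch keeps each triangle in its own partition and gives the hub a single remote agent, so the star around the hub no longer forces a large edge cut between low-degree vertices.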