GraphA: Efficient Partitioning and Storage for Distributed Graph Computation

Yiming Zhang, Dongsheng Li, Chengfei Zhang, Jinyan Wang, Ling Liu
DOI: https://doi.org/10.1109/tsc.2017.2778737
IF: 11.019
2019-01-01
IEEE Transactions on Services Computing
Abstract: Distributed graph computation is central to applications ranging from language processing to social networks. However, natural graphs tend to have skewed power-law distributions, where a small subset of the vertices have a large number of neighbors. As a result, existing graph-parallel systems suffer from load imbalance, high communication cost, and inefficient processing. To address this problem, we present GraphA, an adaptive scheme for efficient large-scale graph computation. At the core of GraphA is an adaptive and uniform graph partitioning algorithm, which partitions the datasets by using an incremental number of mapping functions. GraphA further improves and leverages the ART index structure to realize fine-grained and low-cost graph storage. We have implemented GraphA both on Spark and on GraphLab. Extensive evaluation shows that GraphA significantly outperforms state-of-the-art graph-parallel systems (GraphX and PowerLyra) in ingress time, execution time, and storage cost, for both real-world and synthetic graphs. GraphA achieves up to 7.1x performance improvement over GraphX and 19.7 percent improvement over PowerLyra.
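
The abstract does not spell out the partitioning algorithm, so the following is only a minimal sketch of the general idea of using more mapping functions for higher-degree vertices. It is not GraphA's actual algorithm. The names `mapping` and `partition_edges`, the `threshold` parameter, and the hash family are all hypothetical choices made for illustration. Under these assumptions, a low-degree vertex keeps all of its in-edges on one partition, while a hub's in-edges are spread across several.

```python
from collections import Counter
from math import ceil


def mapping(vertex: int, i: int, num_parts: int) -> int:
    """Hypothetical i-th mapping function of a simple hash family.

    Consecutive values of i land on consecutive partitions, so the first
    num_parts functions for a vertex always hit distinct partitions.
    """
    return (vertex * 2654435761 + i) % num_parts


def partition_edges(edges, num_parts, threshold):
    """Assign each (src, dst) edge to a partition.

    Assumption for this sketch: a destination vertex with in-degree d uses
    k = ceil(d / threshold) mapping functions, capped at num_parts.
    """
    in_degree = Counter(dst for _, dst in edges)
    assignment = {}
    for src, dst in edges:
        k = min(num_parts, max(1, ceil(in_degree[dst] / threshold)))
        i = src % k  # choose one of the k functions based on the source vertex
        assignment[(src, dst)] = mapping(dst, i, num_parts)
    return assignment


# A star graph: vertex 0 is a hub with 100 in-edges; vertex 200 has only 2.
edges = [(s, 0) for s in range(1, 101)] + [(1, 200), (2, 200)]
assignment = partition_edges(edges, num_parts=4, threshold=10)
hub_load = Counter(assignment[(s, 0)] for s in range(1, 101))
print(dict(hub_load))
```

By contrast, hashing every edge on its destination alone would place all 100 hub edges on a single partition, which is the kind of load imbalance the abstract attributes to skewed power-law graphs.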