Enabling Application-Aware Flexible Graph Partition Mechanism for Parallel Graph Processing Systems.

Fang Dong, Junxue Zhang, Junzhou Luo, Dian Shen, Jiahui Jin
DOI: https://doi.org/10.1002/cpe.3849
2016-01-01
Concurrency and Computation Practice and Experience
Abstract: With the emergence of large-scale graph data, Pregel-like parallel graph processing systems have become an essential tool for processing such data efficiently. The first step in using a Pregel-like system is to partition the graph into multiple blocks and distribute them across multiple machines. The partition strategy plays a significant role in determining performance: a good partition both ensures load balance and reduces network communication overhead, whereas a poor one does the opposite. However, existing partition strategies fail to meet these requirements because they suffer from two drawbacks: (1) they ignore the features of the application, and (2) they ignore the fact that production environments run multiple applications. To overcome these drawbacks, we propose the superblock partition strategy, which uses atomic blocks generated by pre-processing the original graph; superblocks can be constructed and re-constructed dynamically, in real time, according to the submitted applications. The hash-based and clustering-based pre-partition methods are covered in detail. An application feature extraction method and a heuristic superblock partition algorithm are proposed to construct the superblocks. Experimental results show that the superblock partition strategy can boost graph processing performance and that its partition efficiency also exceeds that of the hash-based and topology-optimal partition strategies. Copyright (C) 2016 John Wiley & Sons, Ltd.
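To make the two-stage idea concrete, the following is a minimal sketch, not the paper's algorithm. It assumes integer vertex IDs, uses a modulo hash for the pre-partition into atomic blocks, and substitutes a simple greedy load-balancing heuristic for the heuristic superblock partition algorithm, which the abstract does not specify. The names `hash_prepartition`, `build_superblocks`, and `vertex_weight` are hypothetical, and `vertex_weight` merely stands in for whatever per-application features the authors extract. The sketch balances load only and ignores network communication cost.

```python
from collections import defaultdict


def hash_prepartition(edges, num_atomic_blocks):
    """Place each vertex in an atomic block by hashing its ID (assumed: modulo)."""
    blocks = defaultdict(set)
    for u, v in edges:
        blocks[u % num_atomic_blocks].add(u)
        blocks[v % num_atomic_blocks].add(v)
    return dict(blocks)


def build_superblocks(atomic_blocks, vertex_weight, num_workers):
    """Greedily group atomic blocks into one superblock per worker.

    Blocks are visited from heaviest to lightest, and each one is assigned to
    the worker with the smallest accumulated load. This is an assumed stand-in
    heuristic, not the one proposed in the paper.
    """
    # Application-specific weight of each atomic block.
    weights = {
        block_id: sum(vertex_weight(v) for v in vertices)
        for block_id, vertices in atomic_blocks.items()
    }
    loads = [0.0] * num_workers
    assignment = {}
    # Ties on weight are broken by block ID to keep the result deterministic.
    for block_id in sorted(weights, key=lambda b: (-weights[b], b)):
        target = min(range(num_workers), key=lambda w: loads[w])
        assignment[block_id] = target
        loads[target] += weights[block_id]
    return assignment, loads


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
    atomic = hash_prepartition(ring, num_atomic_blocks=4)
    # Uniform weights model an application that touches every vertex equally.
    assignment, loads = build_superblocks(atomic, lambda v: 1.0, num_workers=2)
    print(assignment, loads)
```

Because the atomic blocks are computed once, a newly submitted application only needs to supply a different `vertex_weight` and rerun the inexpensive grouping step, which mirrors the dynamic re-construction described in the abstract.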