Superblock: An Application-Aware Dynamic Partition Strategy for Large-Scale Graph

Junxue Zhang, Fang Dong, Dian Shea, Jiahui Jin, Junzhou Luo, Dian Shen
DOI: https://doi.org/10.1109/cbd.2015.35
2015-10-01
Abstract: The emergence of large-scale graph data poses significant challenges for efficient processing. A fundamental step in processing such a graph is to partition it and distribute the parts across multiple workers for parallel computation. Existing partition strategies suffer from two problems: 1) they ignore application-specific features, so the resulting partitions may not match the application's needs, which can degrade performance; and 2) because they ignore application features, they cannot adapt dynamically to the needs of different applications. This paper proposes Superblock, an application-aware dynamic partition strategy for large-scale graph data, to address these problems. It pre-partitions the graph into blocks, extracts the application's features, and combines the blocks into Superblocks accordingly. The Superblocks are reconstructed when a new application arrives. Experiments with several common graph algorithms confirm that the Superblock strategy improves the performance of various data processing applications on large-scale graph data and is dynamic enough to adjust the partitions for different applications.
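The abstract describes the pipeline only at a high level (pre-partition into blocks, extract application features, combine blocks into Superblocks, rebuild when a new application arrives). The following Python sketch illustrates one possible reading of that flow and is not the authors' algorithm. Everything in it is an assumption made for illustration: the ID-range pre-partitioning, the use of a per-edge weight function as the "application feature", the greedy heaviest-traffic-first merge, and the names `pre_partition`, `build_superblocks`, and `max_blocks_per_superblock`.

```python
from collections import defaultdict


def pre_partition(num_vertices, block_size):
    """Assign each vertex to a block by ID range (an assumed, simplistic scheme)."""
    return {v: v // block_size for v in range(num_vertices)}


def build_superblocks(edges, block_of, edge_weight, max_blocks_per_superblock):
    """Greedily merge blocks that exchange the most application traffic.

    edge_weight(u, v) is a hypothetical stand-in for an application feature:
    how heavily the application is expected to use edge (u, v).
    """
    # Aggregate the application-specific traffic between each pair of blocks.
    traffic = defaultdict(float)
    for u, v in edges:
        a, b = block_of[u], block_of[v]
        if a != b:
            traffic[(min(a, b), max(a, b))] += edge_weight(u, v)

    # Union-find over blocks; each block starts as its own superblock.
    blocks = set(block_of.values())
    parent = {b: b for b in blocks}
    size = {b: 1 for b in blocks}

    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]  # path halving
            b = parent[b]
        return b

    # Visit block pairs from heaviest to lightest traffic (ties broken by pair)
    # and merge them while the capacity limit allows it.
    for (a, b), _w in sorted(traffic.items(), key=lambda kv: (-kv[1], kv[0])):
        ra, rb = find(a), find(b)
        if ra != rb and size[ra] + size[rb] <= max_blocks_per_superblock:
            parent[rb] = ra
            size[ra] += size[rb]

    groups = defaultdict(set)
    for b in blocks:
        groups[find(b)].add(b)
    return sorted(sorted(g) for g in groups.values())


# Toy graph: 8 vertices in 4 blocks of 2 vertices, one edge between each
# neighbouring pair of blocks.
edges = [(1, 2), (3, 4), (5, 6), (0, 7)]
block_of = pre_partition(num_vertices=8, block_size=2)

# Two hypothetical applications that stress different edges.
heavy_a = {(1, 2), (5, 6)}
heavy_b = {(3, 4), (0, 7)}
superblocks_a = build_superblocks(
    edges, block_of, lambda u, v: 10.0 if (u, v) in heavy_a else 1.0, 2)
superblocks_b = build_superblocks(
    edges, block_of, lambda u, v: 10.0 if (u, v) in heavy_b else 1.0, 2)
```

In this toy setup the blocks stay fixed while the grouping changes with the application: the first weight function groups blocks 0 with 1 and 2 with 3, and the second groups blocks 0 with 3 and 1 with 2. That mirrors the abstract's claim that Superblocks are rebuilt per application without repeating the pre-partitioning step, under the assumptions stated above.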