Towards memory and computation efficient graph processing on spark

Xinhui Tian,Yuanqing Guo,Jianfeng Zhan,Lei Wang
DOI: https://doi.org/10.1109/BigData.2017.8257948
2017-12-01
Abstract:Algorithms for large scale natural graph processing can be categorized into two types based on their value propagation behaviors: the unidirectional value propagation (UVP) algorithms and the bidirectional value propagation (BVP) algorithms. The behavior about how vertices interact with neighbors also differs between two algorithm types, which demands different system design choices. However, current distributed graph processing systems usually try to support both types in one general-purpose framework Such system design can not promise good performance and low resource consumption for both types. Especially, for UVP algorithms, current systems can not guarantee low memory footprint, computation efficiency and communication efficiency at the same time. In this paper, we propose a new graph processing engine on Spark, GraphV, which is specially designed for the unidirectional value propagation algorithms, and can satisfy all the above requirements for this type of algorithms. To retain the generalization for other algorithms, we also build a dual-engine framework by integrating GraphV with Spark's existing graph processing engine GraphX. The main design choices of GraphV include a cheap propagation-related partitioner, an one-step computation model, and a locality-aware local graph layout. According to the experiment results, GraphV is faster than GraphX by the factors of 1.2x–3.1x, with much less resource consumption. The source code of GraphV will be publicly available from http://prof.ict.ac.cn/GraphV.
Computer Science
What problem does this paper attempt to address?