Graph Processing Framework Supporting Elastic Scalability in Distributed Shared Environment

Hanglong ZHAN, Donggang CAO, Bing XIE
DOI: https://doi.org/10.3778/j.issn.1673-9418.1509009
2016-01-01
Abstract: As an important pattern in big data processing, graph processing is widely used in many scenarios, such as machine learning, data statistics and data mining. When running enterprise-level applications, various big data processing frameworks are usually deployed in the same distributed cluster, so the runtime environment is open and shared. As a result, graph processing must account for dynamic changes in computing resources. To adapt to these dynamics and make good use of computing resources, a graph processing framework should be capable of elastic scaling. However, to the authors' knowledge, current graph processing frameworks have not yet fully realized elastic scaling. This paper introduces the design and implementation of an elastically scalable parallel graph processing framework, SParTaG. SParTaG first defines the task set and task model of the graph processing problem; then designs an elastically scalable framework based on a task migration mechanism; and finally proposes a load-balancing-based scheduling algorithm. Experiments show that SParTaG achieves performance parity with the currently popular open-source Giraph system and can run graph jobs in an elastically scalable manner.