IncGraph: An Improved Distributed Incremental Graph Computing Model and Framework based on Spark GraphX

Zhuo Tang, Mengsi He, Zhongming Fu, Li Yang
DOI: https://doi.org/10.1109/tkde.2020.3014150
IF: 9.235
2021-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Information mined from a dynamic graph becomes obsolete when the underlying data changes. To obtain up-to-date results, a graph algorithm has to recompute over the entire data set from scratch, which consumes substantial computation time and resources. To reduce this cost, this paper proposes a model called IncGraph that supports incremental iterative computation over dynamic graphs. Unlike traditional iteration, IncGraph executes the graph algorithm by reusing the results from the previous graph and computing only on the part of the graph that has changed. IncGraph has two critical components: (1) an incremental iterative computation model that consists of two steps: an incremental step that calculates the results on the changed vertices of the graph, and a merge step that calculates the results on the entire graph by using the results of the previous graph and of the incremental step; and (2) an incremental update method that accelerates the iterative process within the iterative graph algorithm. We implement the IncGraph model on GraphX and evaluate its performance using several representative iterative graph algorithms: PageRank, Connected Components, and Single Source Shortest Path. The results show that, compared with traditional iteration, when 100k vertices are added to data sets of different sizes, the performance optimization ratio of IncGraph averages 31.79 percent and reaches a maximum of 50.2 percent; and when the percentage of added vertices varies from 0.01 to 10 percent across different data sets, the performance optimization ratio of IncGraph varies from 19.9 to 66.1 percent. Moreover, the result errors of IncGraph are small and can be neglected.
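The two-step model described in the abstract can be illustrated with a minimal sketch for Connected Components. This is plain Python, not GraphX, and it is not the authors' implementation: the function names (`build_adjacency`, `incremental_step`, `merge_step`, `full_recompute`) and the min-label propagation scheme are assumptions chosen for illustration. The sketch also assumes that the graph only grows (vertices and edges are added, never removed), in which case the previous labels remain valid starting points.

```python
from collections import defaultdict


def build_adjacency(edges, vertices=()):
    """Build an undirected adjacency map from an edge list (hypothetical helper)."""
    adj = defaultdict(set)
    for v in vertices:
        adj[v]  # register isolated vertices
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def incremental_step(adj, prev_labels, changed):
    """Propagate minimum labels outward from the changed vertices only.

    Returns a delta holding just the labels that differ from prev_labels.
    A vertex missing from prev_labels starts with its own id as its label.
    """
    delta = {}

    def current(v):
        return delta.get(v, prev_labels.get(v, v))

    frontier = set(changed)
    while frontier:
        next_frontier = set()
        for v in frontier:
            for u in adj[v]:
                lv, lu = current(v), current(u)
                if lv < lu:
                    delta[u] = lv
                    next_frontier.add(u)
                elif lu < lv:
                    delta[v] = lu
                    next_frontier.add(v)
        frontier = next_frontier
    return delta


def merge_step(adj, prev_labels, delta):
    """Combine the previous results with the incremental delta for the whole graph."""
    merged = {v: v for v in adj}
    merged.update(prev_labels)
    merged.update(delta)
    return merged


def full_recompute(adj):
    """Baseline: recompute from scratch by treating every vertex as changed."""
    return merge_step(adj, {}, incremental_step(adj, {}, adj.keys()))


# Usage: compute labels on the old graph, then add edges and update incrementally.
old_edges = [(1, 2), (2, 3), (4, 5)]
new_edges = [(3, 4), (5, 6)]

prev = full_recompute(build_adjacency(old_edges))
new_adj = build_adjacency(old_edges + new_edges)
changed = {v for edge in new_edges for v in edge}
delta = incremental_step(new_adj, prev, changed)
merged = merge_step(new_adj, prev, delta)
```

In this sketch the incremental step starts from the endpoints of the added edges instead of from every vertex, and the merge step overlays the resulting delta on the previous labels. Handling deletions would require invalidating stale labels first, which is outside what this example covers.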
computer science, information systems, artificial intelligence, engineering, electrical & electronic