Automatic Parallelization of Graph Queries with MapReduce

Tao Zan, Yu Liu, Zhenjiang Hu
2012-01-01
Abstract: The MapReduce programming model has gained traction for large-scale data processing in recent years. One typical application is graph query parallelization. Although a lot of work has been done to parallelize graph queries with MapReduce, the data model used for graphs does not reflect the original graph shape, making it difficult to manipulate graphs in a structured way. In this paper, we show a new approach to parallelizing graph queries written in UnQL, a graph query language whose data model is an edge-labelled graph. The most prominent feature of this language is that all UnQL queries can be specified by structural recursion, which can be evaluated in a bulk manner. After carefully examining the properties of structural recursion, we propose a natural way to parallelize structural recursion with MapReduce automatically in two steps: 1) bulk computing and 2) unreachable part removal and global ε-edge elimination. Experiments show the feasibility of our approach for parallelization of graph queries.
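The two steps named in the abstract can be illustrated with a small, single-machine sketch. This is a toy under stated assumptions, not the paper's algorithm: the graph is a list of (source, label, target) edges, the recursion body sees only one edge label at a time, and the names `bulk_compute`, `eliminate_epsilon`, `remove_unreachable`, `run_query`, `EPS`, and `DROP` are hypothetical. No MapReduce framework is used; the per-edge loop simply stands in for a map phase, and the cleanup order (ε-elimination first, then pruning) is chosen for simplicity.

```python
from collections import defaultdict

EPS = "ε"        # hypothetical marker for an ε-edge
DROP = object()  # hypothetical marker: the body yields no edge at all


def bulk_compute(edges, body):
    """Step 1: apply the body to every edge independently (map-like)."""
    out = []
    for src, label, dst in edges:
        result = body(label)
        if result is not DROP:
            out.append((src, result, dst))
    return out


def eliminate_epsilon(edges):
    """Step 2a: replace ε-paths by direct labelled edges."""
    eps_succ = defaultdict(set)
    labelled = defaultdict(set)
    nodes = set()
    for src, label, dst in edges:
        nodes.update((src, dst))
        if label == EPS:
            eps_succ[src].add(dst)
        else:
            labelled[src].add((label, dst))
    result = set()
    for node in nodes:
        # ε-closure of `node` by depth-first search over ε-edges only
        closure, stack = {node}, [node]
        while stack:
            for nxt in eps_succ[stack.pop()]:
                if nxt not in closure:
                    closure.add(nxt)
                    stack.append(nxt)
        # lift every labelled edge leaving the closure up to `node`
        for member in closure:
            for label, dst in labelled[member]:
                result.add((node, label, dst))
    return result


def remove_unreachable(edges, root):
    """Step 2b: keep only edges whose source is reachable from root."""
    succ = defaultdict(set)
    for src, _, dst in edges:
        succ[src].add(dst)
    seen, stack = {root}, [root]
    while stack:
        for nxt in succ[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return {edge for edge in edges if edge[0] in seen}


def run_query(edges, body, root):
    """Bulk step followed by the two cleanup passes."""
    transformed = bulk_compute(edges, body)
    return remove_unreachable(eliminate_epsilon(transformed), root)


# Toy query: skip label "a", delete "c" subtrees, upper-case the rest.
def example_body(label):
    if label == "a":
        return EPS
    if label == "c":
        return DROP
    return label.upper()


example_graph = [(0, "a", 1), (1, "b", 2), (0, "c", 3), (3, "d", 4)]
example_result = run_query(example_graph, example_body, root=0)
```

With this input, `example_result` is `{(0, 'B', 2)}`: the "a" edge collapses into an ε-edge and disappears, and the subgraph behind the dropped "c" edge is pruned as unreachable. Because `bulk_compute` touches each edge in isolation, that step is the part that would distribute naturally across mappers.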