GCPR: An Acceleration Method for PageRank Based on Graph Clustering on MapReduce

LIAO Song-bo, TAO Yue, HE Zhen-ying, WANG Wei
DOI: https://doi.org/10.3969/j.issn.1000-1220.2012.06.008
2012-01-01
Abstract: As new applications emerge, large-scale graphs are used ever more widely, and how to analyze graphs with very large numbers of nodes has drawn researchers' attention. Because of the sheer number of nodes and the complexity of the analysis, large-scale graph analysis relies on MapReduce for parallel computation on distributed systems. On MapReduce, the classical PageRank algorithm must scan and transfer the entire state of the graph at every iteration, and the resulting I/O and network transmission costs increase the total computing time. To address this problem, this paper proposes GCPR, a more efficient PageRank algorithm based on graph clustering on MapReduce, which uses graph clustering and two rounds of compression. GCPR reduces the cost of I/O and network transmission between Map and Reduce (the major bottleneck of MapReduce) and balances the computational resources. Experiments demonstrate that GCPR can greatly improve the computing efficiency of PageRank on MapReduce.
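To make the bottleneck described in the abstract concrete, the following is a minimal single-process sketch of the classical PageRank iteration written in a map/reduce style. It is not the GCPR algorithm, whose clustering and compression steps the abstract does not detail. The function names (`map_phase`, `reduce_phase`, `pagerank`), the damping factor of 0.85, and the omission of dangling-node handling are all assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Emit (key, value) pairs for one PageRank iteration."""
    for node, out_links in graph.items():
        # The adjacency list is re-emitted on every iteration so that the
        # reducer can rebuild the graph state; this is the repeated transfer
        # of the whole graph that the abstract identifies as costly.
        yield node, ("links", out_links)
        if out_links:
            share = ranks[node] / len(out_links)
            for target in out_links:
                yield target, ("rank", share)


def reduce_phase(pairs, num_nodes, damping=0.85):
    """Group pairs by node, then rebuild the graph and the new ranks."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    graph, ranks = {}, {}
    for node, values in grouped.items():
        graph[node] = []
        total = 0.0
        for kind, payload in values:
            if kind == "links":
                graph[node] = payload
            else:
                total += payload
        # Assumed update rule: standard damped PageRank, ignoring the rank
        # mass lost at dangling nodes.
        ranks[node] = (1 - damping) / num_nodes + damping * total
    return graph, ranks


def pagerank(graph, iterations=20, damping=0.85):
    """Run a fixed number of map/reduce rounds over an adjacency dict."""
    n = len(graph)
    ranks = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        graph, ranks = reduce_phase(map_phase(graph, ranks), n, damping)
    return ranks
```

In this sketch every round sends one message per edge plus one adjacency list per node through the shuffle between map and reduce, which is the traffic that an approach such as GCPR aims to reduce.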