Parallelized Similarity Flooding Algorithm for Processing Large Scale Graph Datasets with MapReduce

Jian Zhang, Chunfeng Yuan, Yihua Huang
DOI: https://doi.org/10.1109/PDCAT.2012.109
2012-01-01
Abstract: Measures of graph similarity have a broad range of applications but are compute-intensive. The similarity flooding algorithm compares graphs efficiently when the graphs and the datasets are small. However, large-scale graph applications are increasingly common, and the existing stand-alone similarity flooding algorithm cannot complete the similarity comparison for large-scale graph datasets in acceptable time. This paper presents a similarity flooding algorithm parallelized with MapReduce for large-scale graph datasets. Experimental results demonstrate that the parallelized algorithm achieves a significant performance improvement over the stand-alone similarity flooding algorithm, and that it obtains excellent speedup as the size of the cluster increases.
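The abstract does not describe how the propagation step is split into map and reduce tasks, so the following is only a minimal sketch of how one similarity flooding iteration could be phrased in MapReduce style. It assumes the basic fixpoint update (each node pair keeps its own score and adds the weighted scores flooded in from neighboring pairs, followed by normalization by the maximum) and runs in memory as a stand-in for a real cluster job. The function names, the `edges` layout, and the convergence threshold are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def map_phase(scores, edges):
    """Emit (pair, contribution) records for one propagation step.

    scores: dict mapping a node pair to its current similarity.
    edges:  dict mapping a node pair to a list of (neighbor_pair, weight),
            an assumed encoding of the propagation graph.
    """
    for pair, score in scores.items():
        yield pair, score  # carry the pair's own current score forward
        for neighbor, weight in edges.get(pair, []):
            yield neighbor, score * weight  # flood similarity to a neighbor


def reduce_phase(records):
    """Sum all contributions that share the same node pair (the key)."""
    totals = defaultdict(float)
    for pair, value in records:
        totals[pair] += value
    return dict(totals)


def normalize(scores):
    """Divide every score by the largest one so values stay in [0, 1]."""
    if not scores:
        return {}
    top = max(scores.values())
    if top <= 0:
        return dict(scores)
    return {pair: value / top for pair, value in scores.items()}


def flood(scores, edges, max_iters=50, eps=1e-6):
    """Repeat map, reduce, and normalize until scores stop changing."""
    for _ in range(max_iters):
        updated = normalize(reduce_phase(map_phase(scores, edges)))
        delta = max(abs(updated[p] - scores.get(p, 0.0)) for p in updated)
        scores = updated
        if delta < eps:
            break
    return scores
```

In a real deployment each iteration would be one MapReduce job, with `map_phase` running over partitions of the node pairs and `reduce_phase` grouping records by pair; the normalization needs a global maximum, which would typically require an extra pass or a counter.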