Efficient Spatio-textual Similarity Join Using MapReduce

Yu Zhang, Youzhong Ma, Xiaofeng Meng
DOI: https://doi.org/10.1109/wi-iat.2014.16
2014-01-01
Abstract: Spatio-textual similarity join is a fundamental operation in many applications: it finds all pairs of objects that have similar textual descriptions and are spatially close to each other. With the popularity of GPS-enabled devices and their applications, the volume of spatio-textual data is growing explosively, and existing methods cannot perform spatio-textual similarity joins efficiently on massive data. In this paper, we propose several approaches for spatio-textual similarity join using MapReduce. We use prefix filtering and grid partitioning to prune spatio-textual objects under the filter-and-refine framework. In addition, we propose two kinds of optimization methods to improve the efficiency of the basic spatio-textual similarity join method. Finally, we conduct extensive experiments on several synthetic datasets built from real datasets, and the results show that our approaches perform well in both efficiency and scalability.
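To make the filter-and-refine idea concrete, the following is a minimal single-machine sketch of how prefix filtering and grid partitioning might be combined. It is not the paper's algorithm: the abstract does not specify the similarity function, the grid layout, or the MapReduce job design, so Jaccard similarity, a grid cell side equal to the distance threshold, replication of each object to its 3x3 cell neighbourhood, and all names (`st_join`, `prefix`, `eps`, `t`) are assumptions made for illustration. The "map" and "reduce" phases are simulated with an in-memory dictionary.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def prefix(tokens, freq, t):
    """Prefix of the token set under a global order (rarest token first).

    For Jaccard threshold t, two sets can only reach similarity >= t if
    their prefixes of length |x| - ceil(t * |x|) + 1 share a token.
    """
    toks = sorted(set(tokens), key=lambda w: (freq[w], w))
    # The small tolerance guards against float noise such as 0.7 * 10.
    k = len(toks) - math.ceil(t * len(toks) - 1e-9) + 1
    return toks[:k]


def st_join(objects, eps, t):
    """Return id pairs with distance <= eps and Jaccard >= t.

    objects: list of (oid, x, y, tokens). Assumed layout, not from the paper.
    """
    freq = Counter(w for _, _, _, toks in objects for w in set(toks))

    # "Map": key each object by (grid cell, prefix token). The cell side is
    # eps, so any partner within eps lies in the 3x3 neighbourhood.
    groups = defaultdict(list)
    for obj in objects:
        _, x, y, toks = obj
        cx, cy = math.floor(x / eps), math.floor(y / eps)
        for w in prefix(toks, freq, t):
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    groups[(cx + dx, cy + dy, w)].append(obj)

    # "Reduce": candidates that share a key are verified exactly (refine).
    results = set()
    for group in groups.values():
        for a, b in combinations(group, 2):
            pair = (min(a[0], b[0]), max(a[0], b[0]))
            if pair in results:
                continue
            if math.dist(a[1:3], b[1:3]) <= eps and jaccard(a[3], b[3]) >= t:
                results.add(pair)
    return results


objs = [
    (1, 0.0, 0.0, ["coffee", "shop", "wifi"]),
    (2, 0.5, 0.5, ["coffee", "shop", "bakery"]),
    (3, 10.0, 10.0, ["coffee", "shop", "wifi"]),
    (4, 0.2, 0.1, ["coffee", "shop", "wifi", "cozy"]),
]
print(st_join(objs, eps=1.0, t=0.7))  # {(1, 4)}
```

In this sketch, object 3 is textually identical to object 1 but is never compared with it because the two fall in distant grid cells, and object 2 is close to object 1 but is pruned at `t=0.7` because their prefixes share no token. Replicating every object to nine cells is the simplest way to catch cross-cell pairs; it produces duplicate candidates, which the result set absorbs.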