Keyword Search over Web Documents Based on Earth Mover's Distance.

Jiangang Ma, Quan Z. Sheng, Lina Yao, Yong Xu, Ali Shemshadi
DOI: https://doi.org/10.1007/978-3-319-11749-2_20
2014-01-01
Abstract: Keyword search is widely used in many practical applications. Unfortunately, most keyword-based search engines compute the similarity distance between two Web documents by matching only the keywords at the same positions in the query and document vectors, without considering the impact of keywords at neighbouring positions. Such an approach usually yields incomplete search results. In this paper, we exploit the Earth Mover's Distance (EMD) as a distance function, which is more flexible than other distance functions such as the Euclidean distance. To overcome the high computational complexity of EMD, we use filtering techniques to minimize the total number of actual EMD computations. We further develop a novel lower bound as a new EMD filter for a partial matching technique that is suitable for searching Web documents. The experimental results demonstrate the efficiency of EMD-based search with filtering techniques.
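The filter-and-refine idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the novel lower bound, so the sketch substitutes the well-known centroid lower bound, and it restricts itself to equal-mass one-dimensional keyword histograms with ground distance |i - j|, where the exact EMD reduces to a sum over cumulative differences. All function names (`normalize`, `emd_1d`, `centroid_lower_bound`, `range_search`) are hypothetical.

```python
from itertools import accumulate


def normalize(weights):
    """Scale a keyword-weight vector so that its total mass is 1."""
    total = sum(weights)
    return [w / total for w in weights]


def emd_1d(p, q):
    """Exact EMD between two equal-mass 1-D histograms.

    Assumes ground distance |i - j| between positions i and j; in that
    special case the EMD equals the summed absolute difference of the
    cumulative sums.
    """
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def centroid_lower_bound(p, q):
    """Cheap lower bound on emd_1d: distance between the two centroids.

    This stands in for the paper's own lower bound, which the abstract
    does not describe.
    """
    cp = sum(i * w for i, w in enumerate(p))
    cq = sum(i * w for i, w in enumerate(q))
    return abs(cp - cq)


def range_search(query, documents, threshold):
    """Return (hits, exact_calls) for documents within `threshold` of the query.

    A document is discarded without an exact EMD computation whenever its
    lower bound already exceeds the threshold.
    """
    q = normalize(query)
    hits, exact_calls = [], 0
    for doc_id, vec in documents.items():
        d = normalize(vec)
        if centroid_lower_bound(q, d) > threshold:
            continue  # filtered out: the true EMD cannot be smaller
        exact_calls += 1
        dist = emd_1d(q, d)
        if dist <= threshold:
            hits.append((doc_id, dist))
    return sorted(hits, key=lambda h: h[1]), exact_calls


docs = {"a": [1, 0, 0, 0], "b": [0, 1, 0, 0], "c": [0, 0, 0, 1]}
hits, exact_calls = range_search([1, 0, 0, 0], docs, threshold=1.0)
```

In this toy data, documents "b" and "c" are equally far from the query under the Euclidean distance, because neither shares a non-zero position with it. The EMD instead ranks "b" (keyword at a neighbouring position) closer than "c", and the filter discards "c" before any exact computation is made.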