Efficient Bulk Loading to Accelerate Spatial Keyword Queries
Dongsheng Li, Jinkun Pan, Jiaxin Li, Kian-Lee Tan, Dongxiang Zhang
DOI: https://doi.org/10.1109/icpads.2013.87
2013-01-01
Abstract: With the fast development of location-based services and geo-tagging, spatial keyword queries, which retrieve objects satisfying both spatial and keyword conditions, are becoming increasingly prevalent. A hybrid index that integrates a spatial index (e.g., the R-tree or its variations) with a keyword filter offers a promising approach for processing such queries efficiently. However, how to construct a hybrid index effectively from scratch remains an open problem. The state-of-the-art bulk loading algorithms for the R-tree consider only spatial relationships and cannot be employed for the hybrid index. In this paper, we propose a new bulk loading algorithm, named TPA, which constructs a hybrid index top-down. TPA uses a two-phase method to construct the children of nodes at each level of the hybrid index, taking both spatial and keyword information into consideration, and thus optimizes the hybrid index for spatial keyword queries. We analyze and evaluate its performance using both real and synthetic datasets. Comprehensive experiments show that TPA achieves good performance and space utilization, substantially reducing construction time, query latency, and index size.
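
To make the notion of a hybrid index concrete, the following is a minimal sketch of how a spatial index node augmented with a keyword filter can prune a spatial keyword query. It is not the TPA algorithm and not the authors' implementation: the names `Node`, `make_parent` and `search`, the use of a plain keyword set as the filter, and the AND semantics for query keywords are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Axis-aligned rectangle: (xmin, ymin, xmax, ymax). Points use xmin == xmax.
Rect = Tuple[float, float, float, float]


@dataclass
class Node:
    """Hypothetical hybrid index node: a bounding rectangle plus a keyword set."""
    mbr: Rect
    keywords: Set[str]
    children: List["Node"] = field(default_factory=list)
    obj_id: str = ""  # set only on leaf entries that represent objects


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def make_parent(children: List[Node]) -> Node:
    """Build a parent whose MBR and keyword set cover all of its children."""
    mbr = (
        min(c.mbr[0] for c in children),
        min(c.mbr[1] for c in children),
        max(c.mbr[2] for c in children),
        max(c.mbr[3] for c in children),
    )
    keywords = set().union(*(c.keywords for c in children))
    return Node(mbr, keywords, children)


def search(node: Node, region: Rect, required: Set[str]) -> List[str]:
    """Return ids of objects inside `region` that carry every required keyword.

    A subtree is skipped when its MBR misses the region (spatial pruning) or
    when its keyword set lacks a required keyword (keyword pruning).
    """
    if not intersects(node.mbr, region) or not required <= node.keywords:
        return []
    if not node.children:
        return [node.obj_id]
    hits: List[str] = []
    for child in node.children:
        hits.extend(search(child, region, required))
    return hits


# Illustrative data only; the grouping of objects into nodes is arbitrary here.
cafe = Node((1, 1, 1, 1), {"coffee", "wifi"}, obj_id="cafe")
bar = Node((2, 2, 2, 2), {"beer"}, obj_id="bar")
far_cafe = Node((9, 9, 9, 9), {"coffee"}, obj_id="far_cafe")
root = make_parent([make_parent([cafe, bar]), make_parent([far_cafe])])

print(search(root, (0, 0, 3, 3), {"coffee"}))  # ['cafe']
```

In this sketch the keyword set of an inner node is the union of its children's keywords, so how objects are grouped determines how often keyword pruning succeeds. That is why a construction method that considers only spatial proximity can leave the keyword filter with little pruning power.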