Skia: Scalable and Efficient In-Memory Analytics for Big Spatial-Textual Data

Yang Xu, Bin Yao, Zhi-Jie Wang, Xiaofeng Gao, Jiong Xie, Minyi Guo
DOI: https://doi.org/10.1109/tkde.2019.2915828
IF: 9.235
2020-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: In recent years, spatial-keyword queries have attracted much attention with the fast development of location-based services. However, current spatial-keyword techniques are disk-based and cannot meet the requirements of high throughput and low response time. As data sizes surge, people tend to process data in distributed in-memory environments to achieve low latency. In this paper, we present a distributed solution, Skia (Spatial-Keyword In-memory Analytics), that provides a scalable backend for spatial-textual analytics. Skia introduces a two-level index framework for big spatial-textual data, comprising: (1) an efficient and scalable global index, which prunes many candidate partitions while keeping a small space budget; and (2) four novel local indexes that further support low-latency services for exact and approximate spatial-keyword queries. Skia supports common spatial-keyword queries via traditional SQL programming interfaces. Experiments conducted on large-scale real datasets demonstrate the promising performance of the proposed indexes and of our distributed solution.
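The two-level idea described in the abstract can be illustrated with a minimal Python sketch. It is not Skia's implementation: the abstract does not specify the index structures, so the bounding-box-plus-keyword-set partition summary (standing in for a global index) and the per-partition inverted list (standing in for a local index) are assumptions chosen for brevity, and the names `Partition`, `intersects`, and `range_keyword_query` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    # Assumed "global index" entry: a bounding box (min_x, min_y, max_x, max_y)
    # plus the set of keywords that occur anywhere in the partition.
    bbox: tuple
    keywords: set = field(default_factory=set)
    # Assumed "local index": keyword -> list of (x, y, object_id).
    inverted: dict = field(default_factory=dict)

    def add(self, obj_id, x, y, words):
        for w in words:
            self.keywords.add(w)
            self.inverted.setdefault(w, []).append((x, y, obj_id))


def intersects(a, b):
    """Return True if two axis-aligned boxes overlap (edges inclusive)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_keyword_query(partitions, window, words):
    """Return ids of objects inside `window` that contain every word in `words`."""
    words = set(words)
    results = set()
    if not words:
        return []
    for p in partitions:
        # Level 1: skip partitions that cannot contain a match.
        if not intersects(p.bbox, window) or not words <= p.keywords:
            continue
        # Level 2: intersect the per-keyword hits that fall inside the window.
        hits_per_word = [
            {
                oid
                for (x, y, oid) in p.inverted.get(w, [])
                if window[0] <= x <= window[2] and window[1] <= y <= window[3]
            }
            for w in words
        ]
        results |= set.intersection(*hits_per_word)
    return sorted(results)


# Tiny illustrative dataset (made up for this sketch).
west = Partition((0, 0, 10, 10))
east = Partition((10, 0, 20, 10))
west.add(1, 2, 3, ["coffee", "wifi"])
west.add(2, 8, 8, ["coffee"])
east.add(3, 15, 5, ["coffee", "wifi"])
partitions = [west, east]
```

In this sketch, a call such as `range_keyword_query(partitions, (0, 0, 9, 9), ["coffee", "wifi"])` never scans the `east` partition, because its bounding box does not overlap the query window; only the surviving partition's local inverted lists are consulted.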