CrossIndex: Memory-Friendly and Session-Aware Index for Supporting Crossfilter in Interactive Data Exploration

Tianyu Xia, Hanbing Zhang, Yinan Jing, Zhenying He, Kai Zhang, X. Sean Wang
DOI: https://doi.org/10.1007/978-3-031-00123-9_38
2022-01-01
Abstract: Crossfilter, a typical application of interactive data exploration (IDE), is widely used in data analysis, BI, and other fields. However, as datasets scale up, crossfilter can hardly sustain real-time response. In this paper, we propose a memory-friendly and session-aware index called CrossIndex, which supports crossfilter-style queries with low latency. We first analyze a large number of query workloads generated by previous work and find that queries in a data exploration workload are inter-dependent, meaning that they have overlapping predicates. Based on this observation, this paper defines inter-dependent queries as a session and builds a hierarchical index that accelerates crossfilter-style query processing by exploiting the overlap within a session to reduce unnecessary search space. Extensive experiments show that CrossIndex outperforms almost all other approaches while keeping a low building cost.
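The session observation can be illustrated with a small sketch. This is not the CrossIndex structure itself, whose hierarchical layout the abstract does not specify; it is a plain candidate-set cache, written under the assumption that queries are conjunctions of inclusive range predicates. All names here (`SessionFilter`, `refines`, `rows_examined`) are hypothetical. When a new query only tightens the previous query's predicates, the sketch filters the previous result instead of rescanning the whole table.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, float]
# Hypothetical query form: column name -> inclusive (low, high) range.
Predicates = Dict[str, Tuple[float, float]]


def matches(row: Row, preds: Predicates) -> bool:
    """Check one row against every range predicate."""
    return all(lo <= row[col] <= hi for col, (lo, hi) in preds.items())


def refines(new: Predicates, old: Predicates) -> bool:
    """True if every row matching `new` is guaranteed to match `old`."""
    return all(
        col in new and old[col][0] <= new[col][0] and new[col][1] <= old[col][1]
        for col in old
    )


class SessionFilter:
    """Reuses the previous query's result when predicates overlap."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.prev_preds: Optional[Predicates] = None
        self.prev_ids: List[int] = []
        self.rows_examined = 0  # rows touched by the most recent query

    def query(self, preds: Predicates) -> List[int]:
        if self.prev_preds is not None and refines(preds, self.prev_preds):
            # Same session: search only the rows that survived last time.
            candidates = self.prev_ids
        else:
            # No usable overlap: fall back to a full scan.
            candidates = range(len(self.rows))
        self.rows_examined = len(candidates)
        ids = [i for i in candidates if matches(self.rows[i], preds)]
        self.prev_preds, self.prev_ids = dict(preds), ids
        return ids
```

In a brushing interaction, where a user narrows a range on one chart and then adds a range on another, each refinement in this sketch examines only the rows left by the previous step. A query that widens or moves a range falls back to the full scan, which is the case a real index would need to handle more cleverly.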