Chunk-oriented dimension ordering for efficient range query processing on sparse multidimensional data

Shuai Han, Xianmin Liu, Jianzhong Li
DOI: https://doi.org/10.1007/s11280-022-01098-z
2022-01-01
Abstract: Range query processing is of vital importance in array data management. In many applications, evaluating range queries efficiently on sparse multidimensional data is challenging. Range query performance is strongly affected by the dimension order used, so optimizing the dimension order is essential. Prior work focuses only on optimizing a single global dimension order for the data. However, the data distribution and the query distribution may differ across different parts of the data, so a global dimension order is too coarse-grained to achieve good query performance, and a fine-grained dimension order optimization is needed. In this paper, to exploit the optimization opportunities of fine-grained dimension ordering for range query processing, we first design a two-level linearization method for storing and querying sparse multidimensional data. Unlike previous works, which usually use a global dimension order, the two-level linearization method allows dimension orders to be specified separately for different parts of the data, called chunks. To achieve fine-grained dimension order optimization, we present the chunk-oriented dimension ordering problem for the first time and propose workload-driven dimension ordering algorithms for the uniform case and the non-uniform independent case, respectively. Furthermore, to cope with changing workloads in practical applications, we design a dynamic dimension reordering method that tracks query trends over time and avoids query performance degradation. Finally, experiments are conducted on both synthetic and real-life data to illustrate the effectiveness of our method.
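The idea of a two-level linearization with per-chunk dimension orders can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`linearize`, `build`, `range_query`), the fixed chunk shape, the dictionary-based storage, and the "scanned cells" cost measure are all assumptions made for illustration, and coordinates are assumed to be non-negative integers. Level one assigns each cell to a chunk; level two sorts the cells inside a chunk by a key computed under that chunk's own dimension order.

```python
from bisect import bisect_left, bisect_right
from itertools import product


def linearize(local, shape, order):
    """Map a local coordinate to a 1-D key; order[0] varies slowest."""
    key = 0
    for d in order:
        key = key * shape[d] + local[d]
    return key


def build(cells, chunk_shape, orders, default_order):
    """Group sparse cells by chunk and sort each chunk by its own key order.

    `orders` maps a chunk id to a dimension order (hypothetical layout).
    """
    chunks = {}
    for coord, value in cells.items():
        cid = tuple(c // s for c, s in zip(coord, chunk_shape))
        local = tuple(c % s for c, s in zip(coord, chunk_shape))
        order = orders.get(cid, default_order)
        key = linearize(local, chunk_shape, order)
        chunks.setdefault(cid, []).append((key, coord, value))
    for entries in chunks.values():
        entries.sort()
    return chunks


def range_query(chunks, chunk_shape, orders, default_order, lo, hi):
    """Return (hits, scanned) for the inclusive box lo[d] <= coord[d] <= hi[d].

    `scanned` counts the stored cells read between the smallest and largest
    key the box can produce in each chunk, a simple stand-in for I/O cost.
    """
    hits, scanned = [], 0
    grid = [range(l // s, h // s + 1) for l, h, s in zip(lo, hi, chunk_shape)]
    for cid in product(*grid):
        entries = chunks.get(cid)
        if not entries:
            continue
        order = orders.get(cid, default_order)
        # Clip the query box to this chunk, in local coordinates.
        llo = tuple(max(l - i * s, 0) for l, i, s in zip(lo, cid, chunk_shape))
        lhi = tuple(min(h - i * s, s - 1) for h, i, s in zip(hi, cid, chunk_shape))
        # The key grows with every coordinate, so the box corners bound it.
        kmin = linearize(llo, chunk_shape, order)
        kmax = linearize(lhi, chunk_shape, order)
        keys = [e[0] for e in entries]
        start, stop = bisect_left(keys, kmin), bisect_right(keys, kmax)
        for _, coord, value in entries[start:stop]:
            scanned += 1
            if all(l <= c <= h for l, c, h in zip(lo, coord, hi)):
                hits.append((coord, value))
    return sorted(hits), scanned


# A dense 4x4 chunk queried for one row (x = 1) under two dimension orders.
cells = {(x, y): x * 10 + y for x in range(4) for y in range(4)}
shape = (4, 4)
x_first = build(cells, shape, {}, (0, 1))
y_first = build(cells, shape, {}, (1, 0))
hits_a, scanned_a = range_query(x_first, shape, {}, (0, 1), (1, 0), (1, 3))
hits_b, scanned_b = range_query(y_first, shape, {}, (1, 0), (1, 0), (1, 3))
```

In this toy setting both queries return the same four cells, but the scan touches 4 stored cells when dimension 0 varies slowest and 13 when dimension 1 varies slowest. Because `orders` is keyed by chunk, each chunk can use whichever order suits its local data and query distribution, which is the degree of freedom a chunk-oriented ordering would optimize.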