Experimental evaluation of selectivity estimation on big spatial data

Harry Chasparis, Ahmed Eldawy
DOI: https://doi.org/10.1145/3080546.3080553
2017-05-14
Abstract: With the tremendous volume of spatial datasets, there is an increasing need to process and analyze spatial data. One of the fundamental spatial queries is the selectivity estimation problem, where users want to quickly estimate the total number of records in a given query range. While there have been several approaches to solving this problem for big data, there is no systematic evaluation and comparison of these techniques. In this work, we experimentally examine three of the most widely used techniques for selectivity estimation, namely, sampling, uniform binning, and non-uniform binning. This evaluation will be a basis for deciding when to use each of these techniques based on the application requirements. Furthermore, we study the trade-off between memory usage, preprocessing overhead, online query time, and the accuracy of the results. With extensive experiments on large datasets, we provide an evaluation of these techniques and reveal their benefits and weaknesses.
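To make the problem concrete, the following is a minimal Python sketch of two of the three techniques named in the abstract, sampling and uniform binning, applied to 2D points and a rectangular query range. It is an illustration under stated assumptions, not the authors' implementation: the function names (`exact_count`, `sample_estimate`, `build_uniform_grid`, `grid_estimate`), the point and rectangle representations, and the assumption that records are uniformly distributed inside each grid cell are all hypothetical choices made here. Non-uniform binning is omitted from the sketch.

```python
import random

# Assumed representations (not from the paper):
#   point = (x, y); rectangle = (x1, y1, x2, y2) with x1 < x2 and y1 < y2.

def exact_count(points, query):
    """Ground truth: count points inside the query rectangle (inclusive)."""
    x1, y1, x2, y2 = query
    return sum(1 for (x, y) in points if x1 <= x <= x2 and y1 <= y <= y2)

def sample_estimate(points, query, sample_size, seed=0):
    """Sampling: count hits in a random sample and scale up to the full size."""
    rng = random.Random(seed)
    sample = rng.sample(points, sample_size)
    hits = exact_count(sample, query)
    return hits * len(points) / sample_size

def build_uniform_grid(points, bounds, n):
    """Uniform binning: an n x n grid of equal-size cells holding point counts."""
    bx1, by1, bx2, by2 = bounds
    grid = [[0] * n for _ in range(n)]
    for x, y in points:
        # Clamp so that points on the upper boundary fall in the last cell.
        col = min(int((x - bx1) / (bx2 - bx1) * n), n - 1)
        row = min(int((y - by1) / (by2 - by1) * n), n - 1)
        grid[row][col] += 1
    return grid

def grid_estimate(grid, bounds, query):
    """Sum cell counts weighted by the fraction of each cell the query covers.

    Assumes points are spread uniformly inside each cell.
    """
    bx1, by1, bx2, by2 = bounds
    n = len(grid)
    cell_w = (bx2 - bx1) / n
    cell_h = (by2 - by1) / n
    qx1, qy1, qx2, qy2 = query
    total = 0.0
    for row in range(n):
        cy1 = by1 + row * cell_h
        overlap_y = max(0.0, min(qy2, cy1 + cell_h) - max(qy1, cy1))
        if overlap_y == 0.0:
            continue
        for col in range(n):
            cx1 = bx1 + col * cell_w
            overlap_x = max(0.0, min(qx2, cx1 + cell_w) - max(qx1, cx1))
            total += grid[row][col] * (overlap_x * overlap_y) / (cell_w * cell_h)
    return total
```

In this sketch, the sample size and the grid resolution `n` play the role of the memory budget: a larger sample or a finer grid costs more memory and preprocessing time, and would typically give a closer estimate, which is the kind of trade-off the abstract says the paper evaluates.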