ListMerge: Accelerating Top-k Aggregation Queries over Large Number of Lists.

Shile Zhang, Chao Sun, Zhenying He
DOI: https://doi.org/10.1007/978-3-319-32049-6_5
2016-01-01
Abstract: Sorted lists are widely used for feature indexing in a variety of applications, such as multimedia databases and information retrieval. Answering top-k aggregation queries over a set of lists plays an increasingly important role in these domains. Unfortunately, existing solutions, such as threshold-style (TA-style) algorithms, do not guarantee superior performance on a large number of lists. In this paper, we introduce a merge-based strategy, called ListMerge, to accelerate TA-style algorithms. ListMerge exploits a critical observation about TA-style algorithms: if the aggregation function is monotone and distributive, it is much more efficient to first merge several lists together and then apply a TA-style algorithm. This observation also inspires the development of our cost model, which estimates the best number of lists to merge. Experimental results show that ListMerge can outperform the baseline algorithms by a factor of 4 to 20 on synthetic datasets generated from various distributions.
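The merge-then-query idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the aggregation function is a plain sum (which is monotone and distributive), that every object has an integer score in every list, and that lists are merged in fixed-size groups. The names `merge_lists`, `threshold_topk`, and `group_size` are hypothetical, and the cost model that chooses the group size is not reproduced here.

```python
import heapq
from typing import Dict, List, Tuple

# Assumption: each list maps object id -> integer score, and every object
# appears in every list (the usual setting for threshold-style algorithms).
ScoreList = Dict[str, int]


def merge_lists(lists: List[ScoreList], group_size: int) -> List[ScoreList]:
    """Merge consecutive groups of lists by summing per-object scores.

    Because sum is distributive, summing the merged lists gives the same
    total score per object as summing all original lists.
    """
    merged = []
    for start in range(0, len(lists), group_size):
        group = lists[start:start + group_size]
        merged.append({obj: sum(lst[obj] for lst in group) for obj in group[0]})
    return merged


def threshold_topk(lists: List[ScoreList], k: int) -> List[Tuple[str, int]]:
    """A basic threshold-style (TA) top-k query with sum as the aggregate."""
    # Sorted access: each list ordered by descending score.
    views = [sorted(lst.items(), key=lambda kv: -kv[1]) for lst in lists]
    seen: Dict[str, int] = {}
    for depth in range(len(views[0])):
        threshold = 0
        for view in views:
            obj, score = view[depth]
            threshold += score  # upper bound on any unseen object's total
            if obj not in seen:
                # Random access: fetch the object's score from every list.
                seen[obj] = sum(lst[obj] for lst in lists)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        # Stop once the k-th best seen total reaches the threshold.
        if len(top) == k and top[-1][1] >= threshold:
            return top
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
```

Running `threshold_topk(merge_lists(lists, group_size), k)` should return the same top-k scores as `threshold_topk(lists, k)`, while each round of sorted access touches fewer lists. Whether that pays off overall depends on the merging cost, which is what the paper's cost model is meant to balance.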