Efficient processing of top-k queries: selective NRA algorithms

Jing Yuan, Guangzhong Sun, Tao Luo, Defu Lian, Guoliang Chen
DOI: https://doi.org/10.1007/s10844-012-0208-5
2012-01-01
Journal of Intelligent Information Systems
Abstract: Efficient processing of top-k queries has drawn increasing attention from both industry and academia due to its varied applications. Low access cost is a crucial concern in top-k query processing. Typically, answering a top-k query involves two types of access: sorted access and random access. In some scenarios, the latter is not supported by the data source. Fagin et al. proposed the No Random Access (NRA) algorithm (Fagin et al., J Comput Syst Sci 66:614–656, 2003) for this situation. In this paper, we motivate our work by a key observation about the NRA algorithm: the number of accesses can be further reduced by performing sorted accesses to the different lists of the dataset selectively rather than in parallel. Based on this insight, we propose a Selective NRA (SNRA) algorithm that aims to cut unnecessary access cost. We then optimize the SNRA algorithm in terms of runtime cost and present the SNRA-opt algorithm. Furthermore, we address the problem of instance optimality theoretically and turn SNRA (and SNRA-opt) into instance-optimal algorithms, termed Hybrid-SNRA (HSNRA) and HSNRA-opt. Extensive experimental results show that our algorithms perform significantly fewer sorted accesses than NRA (and its state-of-the-art variations). In terms of runtime cost, the proposed SNRA-opt and HSNRA-opt algorithms are two orders of magnitude faster than the NRA algorithm. In addition, we discuss the parameter selection problem of the SNRA algorithms, both theoretically and experimentally.
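For readers unfamiliar with the baseline, the following is a minimal sketch of the NRA idea of Fagin et al. that the abstract builds on. It is not the SNRA algorithm proposed in the paper. It assumes a sum aggregation function, non-negative scores, and that an object absent from a list scores 0 there. The function name `nra_top_k` and the access counter are hypothetical, introduced only for illustration.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

ScoredList = Sequence[Tuple[Hashable, float]]  # (object, score), sorted by score descending


def nra_top_k(lists: Sequence[ScoredList], k: int) -> Tuple[List[Hashable], int]:
    """Baseline NRA sketch: parallel sorted access only, sum aggregation.

    Returns the top-k objects and the number of sorted accesses performed.
    Assumes non-negative scores; a missing entry counts as score 0.
    """
    m = len(lists)
    known: Dict[Hashable, Dict[int, float]] = {}  # object -> {list index: score}
    last = [lst[0][1] if lst else 0.0 for lst in lists]  # last score read per list
    accesses = 0
    max_depth = max((len(lst) for lst in lists), default=0)

    for depth in range(max_depth):
        # One round of sorted access to every list in parallel.
        for i, lst in enumerate(lists):
            if depth < len(lst):
                obj, score = lst[depth]
                known.setdefault(obj, {})[i] = score
                last[i] = score
                accesses += 1
            else:
                last[i] = 0.0  # exhausted list: nothing more can be contributed

        # Lower bound: unknown scores count as 0.
        lower = {o: sum(s.values()) for o, s in known.items()}
        # Upper bound: unknown scores are at most the last score read from that list.
        upper = {
            o: lower[o] + sum(last[i] for i in range(m) if i not in known[o])
            for o in known
        }
        ranked = sorted(known, key=lambda o: (lower[o], upper[o]), reverse=True)
        top, rest = ranked[:k], ranked[k:]
        if len(top) == k:
            threshold = lower[top[-1]]
            unseen_bound = sum(last)  # best possible total of a never-seen object
            if unseen_bound <= threshold and all(upper[o] <= threshold for o in rest):
                return top, accesses

    # Every list was read completely, so the lower bounds are exact totals.
    totals = {o: sum(s.values()) for o, s in known.items()}
    return sorted(totals, key=totals.get, reverse=True)[:k], accesses
```

In this sketch every round reads one entry from each list, which is the parallel access pattern the abstract contrasts with selective access. The returned access count is the quantity a selective strategy would try to reduce.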