Optimal Top-K Generation of Attribute Combinations Based on Ranked Lists

Jiaheng Lu, Pierre Senellart, Chunbin Lin, Xiaoyong Du, Shan Wang, Xinxing Chen
DOI: https://doi.org/10.1145/2213836.2213883
2012-01-01
Abstract: In this work, we study a novel query type, called top-k,m queries. Suppose we are given a set of groups and each group contains a set of attributes, each of which is associated with a ranked list of tuples, with ID and score. All lists are ranked in decreasing order of the scores of tuples. We are interested in finding the best combinations of attributes, each combination involving one attribute from each group. More specifically, we want the top-k combinations of attributes according to the corresponding top-m tuples with matching IDs. This problem has a wide range of applications from databases to search engines on traditional and non-traditional types of data (relational data, XML, text, etc.). We show that a straightforward extension of an optimal top-k algorithm, the Threshold Algorithm (TA), has shortcomings in solving the top-k,m problem, as it needs to compute a large number of intermediate results for each combination and reads more inputs than needed. To overcome this weakness, we provide here, for the first time, a provably instance-optimal algorithm and further develop optimizations for efficient query evaluation to reduce computational and memory costs and the number of accesses. We demonstrate experimentally the scalability and efficiency of our algorithms over three real applications.
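To make the query semantics concrete, here is a minimal brute-force sketch in Python. It is not the instance-optimal algorithm from the paper, nor the TA extension; it simply enumerates every combination. The function name `top_k_m`, the dictionary layout, the sample data, and the use of summation both to score a tuple across lists and to score a combination over its top-m tuples are assumptions made for illustration; the abstract does not fix the aggregation functions.

```python
from heapq import nlargest
from itertools import product


def top_k_m(groups, k, m):
    """Naive top-k,m evaluation (illustrative only).

    groups: list of dicts, one per group, mapping
            attribute name -> {tuple_id: score}.
    Returns the k best (score, attribute_names) pairs.
    """
    scored = []
    # One attribute from each group forms a combination.
    for combo in product(*(g.items() for g in groups)):
        names = tuple(name for name, _ in combo)
        lists = [ranked for _, ranked in combo]
        # Keep only tuple IDs that appear in every list of the combination.
        shared = set(lists[0]).intersection(*lists[1:])
        # Assumption: a tuple's score is the sum of its per-list scores.
        tuple_scores = [sum(ranked[i] for ranked in lists) for i in shared]
        # Assumption: a combination's score is the sum of its top-m tuples.
        scored.append((sum(nlargest(m, tuple_scores)), names))
    return nlargest(k, scored)


# Hypothetical data: two groups with two attributes each, integer scores.
groups = [
    {"A1": {1: 9, 2: 5, 3: 1}, "A2": {1: 2, 2: 8}},
    {"B1": {1: 7, 2: 6}, "B2": {2: 9, 3: 4}},
]
result = top_k_m(groups, k=2, m=2)
print(result)  # [(27, ('A1', 'B1')), (23, ('A2', 'B1'))]
```

This baseline reads every list in full and scores every combination, which is the kind of cost the abstract says the proposed algorithm is designed to avoid.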