A heuristic approach for λ-representative information retrieval from large-scale data

Jin Zhang, Qiang Wei, Guoqing Chen
DOI: https://doi.org/10.1016/j.ins.2014.03.017
IF: 8.1
2014-01-01
Information Sciences
Abstract: Retrieving representative information from large-scale data has become an important research issue, especially in the context of mobile business/search, where screen size and navigability are limited. This paper focuses on certain aspects of representativeness in database queries and web search, and proposes an approach for extracting a subset of the original search results that offers high coverage and low redundancy. The paper introduces the notion of λ-represent, which makes it possible to describe the λ-represent relationship between sets of data objects. The λ-representative problem is then formulated as an extension of the typical set covering problem, which leads to a heuristic approach (namely, LamRep) for solving the problem effectively and efficiently. Notably, LamRep incorporates a “vote” mechanism and is enhanced with an algorithmic acceleration strategy. Experiments on benchmark data and a real-world example show that LamRep outperforms the other approaches compared.
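To make the set-covering formulation concrete, the sketch below shows a plain greedy covering heuristic in Python. It is not LamRep: the abstract does not give the formal definition of λ-represent, the “vote” mechanism, or the acceleration strategy, so none of these are reproduced here. As a labeled assumption, an object is taken to λ-represent another when a similarity score between them is at least λ. The names `greedy_representatives` and `jaccard`, and the sample `docs`, are hypothetical and serve only as illustration.

```python
from typing import Callable, List, Sequence, Set, TypeVar

T = TypeVar("T")


def greedy_representatives(
    items: Sequence[T],
    sim: Callable[[T, T], float],
    lam: float,
) -> List[int]:
    """Greedy set-cover heuristic (illustrative only, not LamRep).

    ASSUMPTION: item i "lambda-represents" item j when sim(i, j) >= lam.
    Returns indices of the chosen representatives, in selection order.
    """
    n = len(items)
    # covers[i] = indices of the items that item i represents.
    covers: List[Set[int]] = []
    for i in range(n):
        covered = {j for j in range(n) if sim(items[i], items[j]) >= lam}
        covered.add(i)  # every item represents itself, so the loop terminates
        covers.append(covered)

    uncovered: Set[int] = set(range(n))
    chosen: List[int] = []
    while uncovered:
        # Pick the item covering the most still-uncovered items;
        # ties go to the lowest index.
        best = max(range(n), key=lambda i: (len(covers[i] & uncovered), -i))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity (a hypothetical choice of similarity)."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


# Hypothetical search results: two near-duplicate pairs and one outlier.
docs = [
    "apple pie recipe",
    "apple pie recipe easy",
    "python set cover",
    "python greedy set cover",
    "weather today",
]

print(greedy_representatives(docs, jaccard, 0.5))  # → [0, 2, 4]
```

With λ = 0.5, each near-duplicate pair collapses to a single representative, which illustrates the trade-off between coverage and redundancy that the abstract describes. Raising λ to 1.0 makes every result its own representative.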