Efficient Approaches to K Representative G-Skyline Queries

Xu Zhou, Kenli Li, Zhibang Yang, Yunjun Gao, Keqin Li
DOI: https://doi.org/10.1145/3397503
IF: 4.157
2020-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: The G-Skyline (GSky) query is a powerful tool for analyzing optimal groups in decision support. Compared with other group skyline queries, it frees users from having to provide an aggregate function. It also returns more comprehensive results, because it does not overlook important groups that contain non-skyline points. However, users find it hard to make sensible choices among the many results the GSky query returns, especially over a large, high-dimensional dataset or with a large group size. In this article, we investigate k representative G-Skyline (kGSky) queries to obtain a manageable number of optimal groups. The kGSky query inherits the advantages of the GSky query, and its results are representative and diversified. We propose three exact algorithms with novel techniques, including upper bound pruning, a grouping strategy, a layered optimum strategy, and a hybrid strategy, to process the kGSky query efficiently. Because these exact algorithms have high time complexity and precise results are unnecessary in many applications, we further develop two approximate algorithms that trade some accuracy for efficiency. Extensive experiments on both real and synthetic datasets demonstrate the efficiency, scalability, and accuracy of the proposed algorithms.