Efficient K-buckets Skyline Query Algorithm on High-dimensional sparse data

Yanyan XU, Hongzhi WANG, Hong GAO, Jianzhong LI
DOI: https://doi.org/10.19335/j.cnki.2095-6649.2012.08.007
2012-01-01
Abstract: Skyline queries over high-dimensional data, that is, data objects with many attributes, have become a research hotspot. Traditional dimensionality-reduction and k-dominant skyline queries on high-dimensional data assume that the data objects are both complete and accurate. In practice, however, high-dimensional data objects (especially collections of network data) are often incomplete. The bucket algorithm designed for incomplete data has the drawback that the number of buckets grows exponentially with dimensionality, which wastes a large amount of storage space. This paper proposes a new concept of high-dimensional k-dominance, together with an efficient k-buckets skyline query algorithm for high-dimensional sparse data. The algorithm effectively controls the number of buckets and reduces the size of the candidate skyline set. Experiments verify that the new algorithm is especially suitable for high-dimensional sparse data: the sparser the data, the more pronounced the algorithm's advantage.
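To make the bucket-growth problem concrete, the following is a minimal Python sketch of the conventional bucket idea for incomplete data that the abstract criticizes. It is not the paper's k-buckets algorithm. It assumes that `None` marks a missing attribute, that smaller values are preferred, and that dominance is tested only on the dimensions both objects observe. The names `observed_mask`, `dominates`, `bucketize`, and `local_skyline` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Assumption: a data object is a tuple of values, with None for a missing attribute.
Point = Tuple[Optional[float], ...]
Mask = Tuple[bool, ...]


def observed_mask(p: Point) -> Mask:
    """Return the pattern of observed (True) and missing (False) dimensions."""
    return tuple(v is not None for v in p)


def dominates(p: Point, q: Point) -> bool:
    """p dominates q if, on the dimensions both observe, p is no worse
    everywhere and strictly better somewhere (smaller is better)."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False  # no shared dimension, so the objects are incomparable
    return all(a <= b for a, b in common) and any(a < b for a, b in common)


def bucketize(points: List[Point]) -> Dict[Mask, List[Point]]:
    """Group objects by missing-value pattern. With d dimensions there are
    up to 2**d distinct patterns, hence the exponential bucket growth."""
    buckets: Dict[Mask, List[Point]] = defaultdict(list)
    for p in points:
        buckets[observed_mask(p)].append(p)
    return dict(buckets)


def local_skyline(bucket: List[Point]) -> List[Point]:
    """Objects in a bucket that no other object in the same bucket dominates."""
    return [p for p in bucket
            if not any(dominates(q, p) for q in bucket if q is not p)]
```

Within a single bucket all objects share the same observed dimensions, so the local skyline is an ordinary skyline on those dimensions. The cost lies in the number of buckets, which is the quantity the abstract says the proposed algorithm keeps under control.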