Discovering the Skyline of Subspace Clusters in High-Dimensional Data

Guanhua Chen, Xiuli Ma, Dongqing Yang, Shiwei Tang
DOI: https://doi.org/10.1109/FSKD.2008.489
2008-01-01
Abstract: Subspace clustering on high-dimensional datasets can result in an undesirably large set of clusters because of the huge number of possible subspaces. Such a large set of subspace clusters not only raises the cost of computation but also weakens the understandability of the results. Both problems reduce the usability of subspace clustering in real applications. In this paper, we propose a new approach that applies skyline queries to the subspace clustering process, using the dominance relationship to avoid redundant subspace clusters; we characterize this as mining the skyline of subspace clusters. Two algorithms, SkyClu-CBC and SkyClu-IBC, are proposed. Experiments on real and synthetic datasets are carried out to show the effectiveness and efficiency of the proposed methods.
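To make the skyline idea concrete, the following is a minimal sketch of a generic skyline (Pareto) filter over a set of candidate clusters. The abstract does not state which criteria define dominance between subspace clusters, so the two criteria used here (subspace dimensionality and cluster size, both treated as "larger is better") are assumptions for illustration only, as are the names `Cluster`, `dominates`, and `skyline`. This is not SkyClu-CBC or SkyClu-IBC, only the plain dominance test that a skyline query relies on.

```python
from typing import List, NamedTuple


class Cluster(NamedTuple):
    """Hypothetical summary of a subspace cluster (fields are assumptions)."""
    name: str
    dims: int  # number of dimensions in the subspace (assumed: higher is better)
    size: int  # number of points in the cluster (assumed: higher is better)


def dominates(a: Cluster, b: Cluster) -> bool:
    """a dominates b if a is at least as good on every criterion
    and strictly better on at least one."""
    at_least_as_good = a.dims >= b.dims and a.size >= b.size
    strictly_better = a.dims > b.dims or a.size > b.size
    return at_least_as_good and strictly_better


def skyline(clusters: List[Cluster]) -> List[Cluster]:
    """Keep only the clusters that no other cluster dominates (quadratic scan)."""
    return [c for c in clusters if not any(dominates(o, c) for o in clusters)]


clusters = [
    Cluster("A", dims=2, size=100),
    Cluster("B", dims=3, size=80),
    Cluster("C", dims=2, size=60),  # dominated by A
    Cluster("D", dims=4, size=30),
    Cluster("E", dims=3, size=30),  # dominated by B and by D
]

print([c.name for c in skyline(clusters)])  # → ['A', 'B', 'D']
```

Under these assumed criteria, the five candidates shrink to three, which is the kind of reduction in redundant results that the abstract is aiming for.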