Feature Selection for Clustering on High Dimensional Data

Hong Zeng, Yiu-ming Cheung
DOI: https://doi.org/10.1007/978-3-540-89197-0_85
2008-01-01
Abstract: This paper addresses the problem of feature selection for clustering high-dimensional data. The problem is difficult because the ground-truth class labels that could guide the selection are unavailable in clustering. Moreover, the data may have a large number of features, and the irrelevant ones can ruin the clustering. We propose a novel feature weighting scheme for a kernel-based clustering criterion, in which the weight of each feature measures its contribution to the clustering task. Accordingly, we give a well-defined objective function that can be solved explicitly in an iterative way. Experimental results show the effectiveness of the proposed method.
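To make the idea of iteratively learned feature weights concrete, here is a minimal sketch. It is not the kernel-based criterion proposed in the paper, whose objective is not given in the abstract. It swaps in a plain feature-weighted k-means alternation instead: assign points, update centers, then set each feature's weight from its within-cluster dispersion. The function name `weighted_kmeans` and the parameters `init_idx`, `beta`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_kmeans(X, init_idx, beta=2.0, n_iter=20):
    """Illustrative feature-weighted k-means (NOT the paper's kernel method).

    X        : (n, d) data matrix
    init_idx : row indices of X used as initial centers (assumed input)
    beta     : exponent > 1 controlling how sharply weights concentrate
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    centers = X[list(init_idx)].copy()
    k = centers.shape[0]
    w = np.full(d, 1.0 / d)  # start with uniform feature weights

    for _ in range(n_iter):
        # Step 1: assign each point to the nearest center under the
        # weighted squared distance sum_j w_j^beta * (x_j - c_j)^2.
        diff2 = (X[:, None, :] - centers[None, :, :]) ** 2  # (n, k, d)
        dist = (diff2 * w ** beta).sum(axis=2)              # (n, k)
        labels = dist.argmin(axis=1)

        # Step 2: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

        # Step 3: per-feature within-cluster dispersion D_j, then the
        # closed-form minimizer of sum_j w_j^beta * D_j subject to
        # sum_j w_j = 1, which gives w_j proportional to D_j^(-1/(beta-1)).
        D = ((X - centers[labels]) ** 2).sum(axis=0) + 1e-12
        inv = D ** (-1.0 / (beta - 1.0))
        w = inv / inv.sum()

    return labels, centers, w
```

In this sketch, features along which the clusters are compact receive large weights, while features that behave like noise within every cluster receive small ones, so the learned weights can be read as a rough relevance score. Like any alternating scheme of this kind, the result depends on the initial centers.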