Adaptive Dimension Reduction for Clustering High Dimensional Data

C Ding, XF He, HY Zha, HD Simon
DOI: https://doi.org/10.1109/icdm.2002.1183897
2002-01-01
Abstract: It is well known that, for high-dimensional data clustering, standard algorithms such as EM and K-means are often trapped in a local minimum. Many initialization methods have been proposed to tackle this problem, with only limited success. In this paper we propose a new approach that resolves this problem by repeated dimension reductions, so that K-means or EM is performed only in very low dimensions. Cluster membership is utilized as a bridge between the reduced-dimensional subspace and the original space, providing flexibility and ease of implementation. Clustering analysis performed on highly overlapped Gaussians, DNA gene expression profiles and Internet newsgroups demonstrates the effectiveness of the proposed algorithm.
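
The loop described in the abstract (reduce dimensions, cluster in the low-dimensional subspace, carry the memberships back to the original space, repeat) can be illustrated with a minimal sketch. The abstract does not say how each subspace is chosen, so two assumptions are made here purely for illustration: the first subspace comes from the top k-1 principal directions, and each later subspace is the span of the current cluster centroids. The function names `kmeans` and `adaptive_cluster` and the parameters `n_rounds` and `seed` are hypothetical, and only the K-means variant is shown.

```python
import numpy as np

def kmeans(X, k, init_centers, n_iter=50):
    """Plain Lloyd's K-means; returns a cluster label for each row of X."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels

def adaptive_cluster(X, k, n_rounds=5, seed=0):
    """Alternate between subspace selection and low-dimensional K-means (k >= 2)."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)

    # ASSUMPTION: initial subspace = top k-1 principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Y = Xc @ Vt[: k - 1].T
    init = Y[rng.choice(len(Y), size=k, replace=False)]
    labels = kmeans(Y, k, init)

    for _ in range(n_rounds):
        # Bridge step: memberships found in the subspace define
        # centroids in the ORIGINAL high-dimensional space.
        centroids = np.stack([
            Xc[labels == j].mean(axis=0) if np.any(labels == j)
            else Xc[rng.integers(len(Xc))]  # re-seed an empty cluster
            for j in range(k)
        ])
        # ASSUMPTION: next subspace = span of the current centroids.
        Q, _ = np.linalg.qr(centroids.T)  # orthonormal basis, at most k columns
        Y = Xc @ Q
        new_labels = kmeans(Y, k, centroids @ Q)
        if np.array_equal(new_labels, labels):
            break  # memberships stopped changing
        labels = new_labels
    return labels
```

Calling `adaptive_cluster(X, k=3)` on an `(n, d)` array returns one integer label per row. K-means only ever runs on data with at most k coordinates, and the labels are the only information passed between the subspace and the original space.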