A Weighted Adaptive Mean Shift Clustering Algorithm.

Ya-Zhou Ren, Carlotta Domeniconi, Guoji Zhang, Guo-Xian Yu
DOI: https://doi.org/10.1137/1.9781611973440.91
2014-01-01
Abstract: The mean shift algorithm is a nonparametric clustering technique that makes no assumptions about the number of clusters or their shapes. It achieves this by performing kernel density estimation and iteratively locating the local maxima of the kernel mixture. The set of points that converge to the same mode defines a cluster. While appealing, the performance of the mean shift algorithm deteriorates significantly on high-dimensional data due to the sparsity of the input space. In addition, noisy features can create challenges for the mean shift procedure. In this paper we extend the mean shift algorithm to overcome these limitations while maintaining its desirable properties. To achieve this goal, we first estimate the relevant subspace for each data point, and then embed this information within the mean shift algorithm, thus avoiding computing distances in the full-dimensional input space. The resulting approach achieves the best of both worlds: effective management of high-dimensional data and noisy features, while preserving a nonparametric nature. Our approach can also be combined with random sampling to speed up the clustering process on large-scale data without sacrificing accuracy. Extensive experimental results on both synthetic and real-world data demonstrate the effectiveness of the proposed method.
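For readers unfamiliar with the baseline procedure the abstract describes, the following is a minimal sketch of standard Gaussian-kernel mean shift in plain Python. It is not the weighted adaptive variant proposed in the paper: it computes distances in the full input space and uses a single fixed bandwidth. The function name `mean_shift`, the parameters `bandwidth`, `max_iter`, and `tol`, and the rule of merging converged points closer than `bandwidth / 2` are all assumptions made for illustration.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Standard mean shift with a Gaussian kernel (illustrative sketch).

    Returns (modes, labels): the estimated density modes and, for each
    input point, the index of the mode it converged to.
    """

    def shift(x):
        # Gaussian kernel weight of every data point relative to x.
        weights = [
            math.exp(-math.dist(x, p) ** 2 / (2 * bandwidth ** 2))
            for p in points
        ]
        total = sum(weights)
        # The weighted mean of the data is the next position of x.
        return [
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(x))
        ]

    # Move each point uphill on the kernel density estimate until it stops.
    converged = []
    for p in points:
        x = list(p)
        for _ in range(max_iter):
            nxt = shift(x)
            moved = math.dist(x, nxt)
            x = nxt
            if moved < tol:
                break
        converged.append(x)

    # Points that end up near the same mode form one cluster.
    # The bandwidth / 2 merge threshold is an assumed heuristic.
    modes, labels = [], []
    for x in converged:
        for k, m in enumerate(modes):
            if math.dist(x, m) < bandwidth / 2:
                labels.append(k)
                break
        else:
            modes.append(x)
            labels.append(len(modes) - 1)
    return modes, labels


# Two small, well-separated groups of 2-D points (made-up toy data).
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
modes, labels = mean_shift(data, bandwidth=1.0)
print(len(modes), labels)
```

Because every shift step computes a distance over all features, irrelevant or noisy dimensions affect every kernel weight; this is the limitation that the abstract says the proposed method addresses by estimating a relevant subspace for each point.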