An Improved K-Means Clustering Algorithm Based on Spectral Method

Shengwen Tian, Hongyong Yang, Yilei Wang, Ali Li
DOI: https://doi.org/10.1007/978-3-540-92137-0_58
2008-01-01
Abstract: It is well known that the K-means algorithm is very sensitive to outliers and often terminates at a local optimum. Furthermore, K-means requires the number of clusters K to be specified in advance as prior knowledge. As a result, the quality of its clustering results is often unsatisfactory. In this paper, we develop an improved K-means clustering algorithm, NK-means. NK-means is based on spectral methods: it uses the Normal matrix from spectral analysis to normalize the original dataset, and then finds clusters in the processed dataset with the K-means algorithm. We also propose a measure of the strength of the cluster structure found by NK-means, which provides an objective metric for choosing the number K of clusters into which a dataset should be divided. Experiments show that NK-means significantly outperforms K-means in both efficiency and accuracy.
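The two-stage idea in the abstract (normalize with the Normal matrix, then run K-means on the processed data) can be illustrated with a minimal sketch. The abstract does not define the Normal matrix or the affinity function, so this sketch assumes the form common in spectral clustering, N = D^{-1} A, with a Gaussian affinity A and the diagonal degree matrix D. The function names, the `sigma` parameter, and the deterministic farthest-point initialization are all hypothetical choices made for illustration, not the authors' implementation. The cluster-strength measure is omitted because the abstract does not specify it.

```python
import math


def normal_matrix(points, sigma=1.0):
    """Row-normalized affinity matrix N = D^-1 A (assumed definition).

    A[i][j] is a Gaussian affinity between points i and j; dividing each
    row by its sum (the degree) makes every row of N sum to 1.
    """
    rows = []
    for p in points:
        row = [math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2)) for q in points]
        degree = sum(row)  # >= 1, since the self-affinity is exp(0) = 1
        rows.append([a / degree for a in row])
    return rows


def farthest_point_init(rows, k):
    """Deterministic seeding (an illustrative choice): start from the first
    row, then repeatedly add the row farthest from all chosen centers."""
    centers = [list(rows[0])]
    while len(centers) < k:
        nxt = max(rows, key=lambda r: min(math.dist(r, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(rows, k, max_iters=100):
    """Plain Lloyd-style K-means on the given rows; returns one label per row."""
    centers = farthest_point_init(rows, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(r, centers[c])) for r in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def nk_means_sketch(points, k, sigma=1.0):
    """Stage 1: map each point to its row of N. Stage 2: K-means on those rows."""
    return kmeans(normal_matrix(points, sigma), k)
```

For two well-separated groups such as `[(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]`, calling `nk_means_sketch(points, k=2)` returns `[0, 0, 0, 1, 1, 1]`. After normalization, points in the same group have nearly identical rows and points in different groups have nearly orthogonal rows, which is what makes the second-stage K-means easy under these assumptions.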