An Improved Clustering Method Based on Data Field

Yu Liu,Cui Xu,Ke Xu,Jianzhi Jin
DOI: https://doi.org/10.4028/www.scientific.net/AMM.457-458.919
2013-10-01
Abstract:By analyzing the problem of k-means, we find the traditional k-means algorithm suffers from some shortcomings, such as requiring the user to give out the number of clusters k in advance, being sensitive to the initial cluster centers, being sensitive to the noise and isolated data, only being applied to the type found in globular clusters, and being easily trapped into a local solution et cetera. This improved algorithm uses the potential of data to find the center data and eliminate the noise data. It decomposes big or extended cluster into several small clusters, then merges adjacent small clusters into a big cluster using the information provided by the Safety Area. Experimental results demonstrate that the improved k-means algorithm can determine the number of clusters, distinguish irregular cluster to a certain extent, decrease the dependence on the initial cluster centers, eliminate the effects of the noise data and get a better clustering accuracy.
Computer Science
What problem does this paper attempt to address?