Cluster Center Initialization Parallel Algorithm for K-Means Algorithm

Bo Yan, Ye Zhang, Hong Yi Su, Hong Zheng
DOI: https://doi.org/10.4028/www.scientific.net/amr.989-994.2169
2014-01-01
Advanced Materials Research
Abstract: K-Means is one of the best-known unsupervised clustering algorithms. It has notable drawbacks: it is sensitive to the choice of initial cluster centers, and it repeatedly processes all data points, which becomes costly as data volume grows. To overcome these limitations, this paper proposes a density-based approach that finds k cluster centers by adjusting a threshold. We then design and implement the K-Means algorithm on a modern Graphics Processing Unit (GPU). The ratio of between-class distance to within-class distance and the speedup are used as evaluation criteria. The experiments indicate that the proposed algorithm significantly improves the stability and efficiency of K-Means.
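
The abstract does not spell out the density rule or the exact evaluation formula, so the following is only a minimal CPU sketch of the general idea, not the authors' method. It assumes that density means the number of neighbors within a fixed radius, that the adjusted threshold is a minimum separation between chosen centers, and that the evaluation ratio is the mean center-to-center distance divided by the mean point-to-nearest-center distance. The names `density_init`, `separation_ratio`, `radius`, and `shrink` are hypothetical.

```python
import math


def density_init(points, k, radius, shrink=0.9, max_rounds=50):
    """Pick k initial centers from dense regions (illustrative assumption,
    not the paper's exact rule)."""
    n = len(points)
    # Assumed density measure: number of other points within `radius`.
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: -density[i])  # densest first
    # Start with the largest pairwise distance as the separation threshold.
    threshold = max(math.dist(p, q) for p in points for q in points)
    for _ in range(max_rounds):
        centers = []
        for i in order:
            # Accept a candidate only if it lies farther than `threshold`
            # from every center chosen so far.
            if all(math.dist(points[i], c) > threshold for c in centers):
                centers.append(points[i])
                if len(centers) == k:
                    return centers
        threshold *= shrink  # too few centers found: relax the threshold
    raise ValueError("could not find k well-separated dense points")


def separation_ratio(points, centers):
    """Assumed criterion: mean distance between centers divided by mean
    distance from each point to its nearest center (requires k >= 2)."""
    within = sum(min(math.dist(p, c) for c in centers) for p in points) / len(points)
    pairs = [(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    between = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return between / within


# Two well-separated toy clusters, each a unit square with a point in the middle.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
        (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
centers = density_init(data, k=2, radius=1.0)
ratio = separation_ratio(data, centers)
```

On this toy data the densest point of each square is its middle point, so the two chosen centers fall in different clusters. A GPU version would typically parallelize the pairwise distance computations, which dominate the cost here; that mapping is left out of this sketch.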