Application of an improved K-means algorithm in gene expression data analysis

Qian Ren,Xinjian Zhuo
DOI: https://doi.org/10.1109/ISB.2011.6033126
2011-01-01
Abstract:K-means algorithm is one of the most classic partition algorithms in clustering algorithms. The result obtained by K-means algorithm varies with the choice of the initial clustering centers. Motivated by this, an improved K-means algorithm is proposed based on the Kruskal algorithm, which is famous in graph theory. The procedure of this algorithm is shown as follows: Firstly, the minimum spanning tree (MST) of the clustered objects is obtained by using Kruskal algorithm. Then K-1 edges are deleted based on weights in a descending order. At last, the average values of the objects contained by the k-connected graphs resulting from last two steps are regarded as the initial clustering centers to cluster. Make the improved K-means algorithm used in gene expression data analysis, simulation experiment shows that the improved K-means algorithm has a better clustering effect and higher efficiency than the traditional one. © 2011 IEEE.
What problem does this paper attempt to address?