Study on Dynamic Clustering Analysis Method for Gene Expression Data Based on Multidimension Pseudo F-statistics

Jia-wei LUO, Ren-fa LI, Bai-ni ZHANG
DOI: https://doi.org/10.3969/j.issn.1004-731X.2006.03.016
2006-01-01
Abstract: K-means is a widely used iterative algorithm for clustering gene expression data. The algorithm assumes that the number of clusters is K and iteratively minimizes the value of an objective function, which considerably improves the clustering result. However, K-means depends strongly on its parameters, and the number of clusters stays fixed throughout. This paper introduces the pseudo F-statistic together with the idea of dynamically adjusting the number of clusters, and proposes a new dynamic K-means clustering algorithm for gene expression data based on the multidimensional pseudo F-statistic. The experimental results show that the algorithm can adjust the number of clusters and arrive at an optimal number of clusters, which suggests that it attains better clustering quality.
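As a rough illustration of the idea in the abstract, the Python sketch below scores K-means partitions with a pseudo F-statistic and keeps the cluster number that scores highest. The abstract does not define the multidimensional pseudo F-statistic or the rule for adjusting K, so two things here are assumptions rather than the authors' method: the statistic is taken in the common Calinski-Harabasz form, F = [B/(k-1)] / [W/(n-k)], where B and W are the between-cluster and within-cluster sums of squared distances over all dimensions, and the dynamic adjustment is replaced by a plain scan over candidate values of K. The function names, the farthest-first initialization, and the default range of K are likewise hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of profiles."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def farthest_first(data, k):
    """Deterministic initialization (an assumption, not from the paper):
    start from the first profile, then repeatedly add the profile that is
    farthest from all centers chosen so far."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return [list(c) for c in centers]

def kmeans(data, k, max_iter=100):
    """Standard Lloyd iteration; returns one cluster label per profile."""
    centers = farthest_first(data, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in data]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    return labels

def pseudo_f(data, labels):
    """Assumed Calinski-Harabasz form: [B / (k - 1)] / [W / (n - k)]."""
    n = len(data)
    clusters = {}
    for p, l in zip(data, labels):
        clusters.setdefault(l, []).append(p)
    k = len(clusters)  # count only non-empty clusters
    if k < 2 or k >= n:
        return 0.0  # the statistic is undefined here; treat as worst score
    grand = mean(data)
    between = within = 0.0
    for members in clusters.values():
        center = mean(members)
        between += len(members) * sq_dist(center, grand)
        within += sum(sq_dist(p, center) for p in members)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

def choose_k(data, k_min=2, k_max=6):
    """Scan candidate cluster numbers and keep the highest pseudo F."""
    best_score, best_k, best_labels = None, None, None
    for k in range(k_min, min(k_max, len(data) - 1) + 1):
        labels = kmeans(data, k)
        score = pseudo_f(data, labels)
        if best_score is None or score > best_score:
            best_score, best_k, best_labels = score, k, labels
    return best_k, best_labels
```

Calling `choose_k(profiles)` on a list of equal-length numeric lists returns the selected cluster number and the corresponding labels. Scanning K from the outside is the simplest way to let the statistic pick the cluster number; an algorithm that adjusts K during the iteration itself, as the abstract describes, would need a split or merge rule that the abstract does not specify.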