A Weighted K-means Gene Clustering Algorithm

Deng-ju YAO, Xiao-juan ZHAN, Xiao-jing ZHANG
DOI: https://doi.org/10.15938/j.jhust.2017.02.021
2017-01-01
Abstract: To address the complex correlations among genes in microarray data sets, a weighted K-means gene clustering algorithm based on random forest variable importance scores is proposed. First, a random forest classifier is trained on the microarray data, with samples as objects and genes as features, and a variable importance score is calculated for each gene. Then, weighted K-means clustering is performed with genes as objects, samples as features, and the variable importance scores as weights. Experiments were carried out on three data sets: Leukemia, Breast and DLBCL. The results show that, measured by the ratio of between-cluster distance to total distance, the proposed weighted K-means algorithm scores on average 17.7 percentage points higher than the original K-means algorithm, and that its clusters have better within-cluster homogeneity and between-cluster separation.
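The second stage described in the abstract can be illustrated with a minimal pure-Python sketch. It rests on several assumptions that the abstract does not confirm: the importance scores are taken as already computed (for example from the `feature_importances_` attribute of a fitted scikit-learn `RandomForestClassifier`), each gene's score is used as a per-object weight in the centroid update, and the evaluation ratio is read as the between-cluster sum of squares divided by the total sum of squares. The function names `weighted_kmeans` and `between_total_ratio` are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(points, weights, k, n_iter=100, seed=0):
    """Cluster rows of `points` (one row per gene, one column per sample).

    Assumption: `weights` holds one importance score per gene, and each
    centroid is the weight-averaged mean of its member genes, so genes with
    higher importance pull their centroid harder.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Initialise centroids from k randomly chosen genes.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: importance-weighted mean of each cluster's members.
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            total_w = sum(weights[i] for i in members)
            if total_w > 0:  # keep the old centroid if the cluster is empty
                centroids[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total_w
                    for d in range(dim)
                ]
    return labels, centroids


def between_total_ratio(points, labels):
    """Between-cluster sum of squares over total sum of squares.

    Assumption: this is how the 'ratio of between-cluster distance to total
    distance' is read here; unweighted cluster means are used.
    """
    n, dim = len(points), len(points[0])
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    total = sum(sq_dist(p, grand) for p in points)
    between = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        mean_c = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sq_dist(mean_c, grand)
    return between / total
```

Calling `weighted_kmeans(points, weights, k)` on a gene-by-sample matrix, and then passing the returned labels to `between_total_ratio`, gives a value between 0 and 1, where higher values indicate tighter and better separated clusters. Setting all weights equal reduces the update step to ordinary K-means, which gives a baseline for comparison.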