Correlation Clustering Based on Genetic Algorithm for Documents Clustering

Zhenya Zhang, Hongmei Cheng, Wanli Chen, Shuguang Zhang, Qiansheng Fang
DOI: https://doi.org/10.1109/cec.2008.4631230
2008-01-01
Abstract: Correlation clustering is an NP-hard problem, and techniques for solving it can be used to cluster a data set given a relation matrix over its items. This paper presents GeneticCC, an approach to the correlation clustering problem based on a genetic algorithm. To evaluate the quality of a clustering division, a clustering precision based on data correlation is defined and its properties are discussed. Experimental results show that, with clustering precision as the criterion, the clustering divisions that GeneticCC constructs for the UCI document data set outperform those constructed by a SOM neural network.
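The abstract does not give the details of GeneticCC or the definition of clustering precision, so the following is only a minimal sketch of the general idea of applying a genetic algorithm to correlation clustering. It assumes a relation matrix with entries +1 (similar) and -1 (dissimilar) and uses the standard disagreement count as the fitness, which may differ from the paper's clustering precision. The encoding (one cluster label per item), the operators, and all names and parameter values (`disagreements`, `genetic_cc`, `pop_size`, `mutation_rate`) are illustrative assumptions, not the authors' method.

```python
import random

def disagreements(labels, sim):
    """Count pairs whose assignment contradicts the +1/-1 relation matrix."""
    n = len(labels)
    cost = 0
    for i in range(n):
        for j in range(i + 1, n):
            same = labels[i] == labels[j]
            # Similar pair split apart, or dissimilar pair grouped together.
            if (sim[i][j] > 0 and not same) or (sim[i][j] < 0 and same):
                cost += 1
    return cost

def genetic_cc(sim, pop_size=30, generations=100, mutation_rate=0.1, seed=0):
    """Hypothetical GA: each individual is a list of cluster labels, one per item."""
    rng = random.Random(seed)
    n = len(sim)  # assumes n >= 2
    pop = [[rng.randrange(n) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: keep the better half unchanged.
        pop.sort(key=lambda ind: disagreements(ind, sim))
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: reassign an item to a random cluster label.
            child = [rng.randrange(n) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda ind: disagreements(ind, sim))

# Toy relation matrix: items {0, 1} and {2, 3} are mutually similar.
sim = [
    [ 1,  1, -1, -1],
    [ 1,  1, -1, -1],
    [-1, -1,  1,  1],
    [-1, -1,  1,  1],
]
best = genetic_cc(sim)
print(best, disagreements(best, sim))
```

Because the better half of each generation is carried over unchanged, the best disagreement count never gets worse from one generation to the next; how close it gets to the optimum depends on the assumed parameters.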