The Comparison of SOM and K-means for Text Clustering.

Yiheng Chen, Bing Qin, Ting Liu, Yuanchao Liu, Sheng Li
DOI: https://doi.org/10.5539/cis.v3n2p268
2010-01-01
Abstract: SOM and k-means are two classical methods for text clustering. This paper reports experiments comparing their performance on a sample of 420 articles drawn from different topics. K-means is simple and easy to implement; SOM has a relatively complex structure, but its clustering results are more visual and easier to comprehend. The comparison also shows that k-means is sensitive to its initial distribution, whereas the overall clustering performance of SOM is better than that of k-means. SOM also performs well at detecting noisy documents and at topology preservation, which makes it more suitable for applications such as navigating document collections and multi-document summarization. However, the clustering results of SOM are sensitive to the output layer topology.
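The sensitivity of k-means to its starting point can be seen even on a tiny example. The sketch below is an illustration only, not the paper's experiment: it assumes that the "initial distribution" refers to the choice of starting centroids, uses four hypothetical 2-D points in place of document vectors, and runs a plain Lloyd-style k-means from two hand-picked seed sets. The function names (`sq_dist`, `assign`, `kmeans`) are made up for this example.

```python
# Toy illustration only: four 2-D points stand in for document vectors.
# This is not the paper's data, features, or experimental setup.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, seeds, max_iter=100):
    """Plain Lloyd-style k-means from explicit seeds; returns (labels, sse)."""
    centroids = [tuple(float(v) for v in s) for s in seeds]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break  # assignments can no longer change
        centroids = updated
    labels = assign(points, centroids)
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, sse

# A wide rectangle: the natural split is left pair vs. right pair.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

labels_a, sse_a = kmeans(points, seeds=[(0, 0), (4, 0)])  # one seed per side
labels_b, sse_b = kmeans(points, seeds=[(0, 0), (0, 1)])  # both seeds on the left

print(labels_a, sse_a)  # [0, 0, 1, 1] 1.0
print(labels_b, sse_b)  # [0, 1, 0, 1] 16.0
```

Both runs converge, but the second settles into a top/bottom split with a sum of squared errors sixteen times larger than the first. Restarting from several seed sets and keeping the lowest-error result is a common way to reduce this effect.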