Soft Clustering Criterion Functions for Partitional Document Clustering: a Summary of Results

Ying Zhao,George Karypis
DOI: https://doi.org/10.1145/1031171.1031225
2004-01-01
Abstract:Recently published studies have shown that partitional clustering algorithms that optimize certain criterion functions, which measure key aspects of inter- and intra-cluster similarity, are very effective in producing hard clustering solutions for document datasets and outperform traditional partitional and agglomerative algorithms. In this paper we study the extent to which these criterion functions can be modified to include soft membership functions and whether or not the resulting soft clustering algorithms can further improve the clustering solutions. Specifically, we focus on four of these hard criterion functions, derive their soft-clustering extensions, and present an experimental evaluation involving twelve different datasets. Our results show that introducing softness into the criterion functions tends to lead to better clustering results for most datasets.
What problem does this paper attempt to address?