A Comparative Study of A Practical Stochastic Clustering Method with Traditional Methods

Swee Chuan Tan, Kai Ming Ting, Shyh Wei Teng
DOI: https://doi.org/10.1007/978-3-642-17432-2_12
2010-01-01
Abstract: In many real-world clustering problems, there is usually little information about the clusters underlying a given dataset. For example, the number of clusters hidden in many datasets is usually not known a priori. This is an issue because many traditional clustering methods require such information as input. This paper examines a practical stochastic clustering method (PSCM) that can find clusters in datasets without requiring users to specify the centroids or the number of clusters. In comparisons with traditional methods (k-means, self-organising map and hierarchical clustering methods), PSCM is found to be robust against overlapping clusters and clusters of uneven sizes. The proposed method also scales well with datasets that have varying numbers of clusters and dimensions. Finally, our experimental results on real-world data confirm that the proposed method performs competitively against the traditional clustering methods in terms of clustering accuracy and efficiency.