An Improved Semi-Supervised Clustering Algorithm For Multi-Density Datasets With Fewer Constraints

Xiaoyun Chen, Sha Liu, Tao Chen, Zhengquan Zhang, Hairong Zhang
DOI: https://doi.org/10.1016/j.proeng.2012.01.665
2012-01-01
Abstract: Semi-supervised clustering algorithms aim to improve clustering results significantly by using limited supervision in the form of labelled instances or pairwise constraints. However, few of these algorithms are designed specifically for datasets that have both multiple densities and complex shapes, even though such data are common in the real world. In this paper, an improved semi-supervised clustering algorithm is proposed based on the SCMD algorithm. The new algorithm can handle multi-density problems, including not only density variation between clusters but also density differences within a cluster, and it can yield superior performance with fewer constraints. We test the new algorithm on several synthetic datasets of varying shapes, sizes, and densities. Experimental results show that our algorithm clearly outperforms the SCMD algorithm, even when the constraints are insufficient. © 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of Harbin University of Science and Technology.