Semi-supervised clustering method based on active learning strategy

LU Shi-dan, CUI Rong-yi
DOI: https://doi.org/10.3969/j.issn.1001-3695.2013.06.030
2013-01-01
Abstract: This paper proposes a semi-supervised clustering method in which an active learning strategy selects the informative data to be labeled. First, the traditional K-means algorithm produces a coarse clustering of the unlabeled dataset. Next, based on the coarse clustering, the method calculates the membership degree of each data point with respect to each cluster and screens out candidate points whose difference between the largest and second-largest membership degree falls below a threshold. Among these candidates, the points with relatively small differences are treated as informative samples and labeled. Finally, each remaining unlabeled data point is assigned to the labeled cluster to which its average distance is smallest. The experimental results show that the proposed active learning strategy is effective at identifying informative data, and that the resulting semi-supervised clustering method is accurate on a variety of datasets.
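The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not give them: the membership degree is taken to be the fuzzy c-means form with fuzzifier 2 (inverse squared distance, normalized), the labeling budget `budget` and the `oracle` callback that returns a true label are hypothetical, and "minimum average distance" is read as the mean Euclidean distance from a point to the labeled members of each cluster.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain K-means for the coarse clustering step; returns k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group (keep it if empty).
        new_centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def memberships(p, centers, eps=1e-12):
    """Assumed membership degree: normalized inverse squared distance."""
    inv = [1.0 / (math.dist(p, c) ** 2 + eps) for c in centers]
    total = sum(inv)
    return [v / total for v in inv]


def select_queries(points, centers, threshold, budget):
    """Indices of the most ambiguous points: the gap between the two largest
    membership degrees is below threshold, smallest gaps first."""
    candidates = []
    for i, p in enumerate(points):
        top = sorted(memberships(p, centers), reverse=True)
        gap = top[0] - top[1]
        if gap < threshold:
            candidates.append((gap, i))
    candidates.sort()
    return [i for _, i in candidates[:budget]]


def assign(points, labeled):
    """Give each unlabeled point the label whose labeled members are closest
    on average. `labeled` maps point index -> label and must be non-empty."""
    by_label = {}
    for i, lab in labeled.items():
        by_label.setdefault(lab, []).append(points[i])

    def avg_dist(p, lab):
        members = by_label[lab]
        return sum(math.dist(p, q) for q in members) / len(members)

    return [
        labeled[i] if i in labeled
        else min(by_label, key=lambda lab: avg_dist(p, lab))
        for i, p in enumerate(points)
    ]


def semi_supervised_cluster(points, k, threshold, budget, oracle):
    """Full sketch: coarse clustering, query selection, final assignment.
    `oracle(i)` is a hypothetical callback returning the label of point i."""
    centers = kmeans(points, k)
    labeled = {i: oracle(i)
               for i in select_queries(points, centers, threshold, budget)}
    if not labeled:  # nothing ambiguous: keep the coarse clustering
        return [max(range(k), key=memberships(p, centers).__getitem__)
                for p in points]
    return assign(points, labeled)
```

In this sketch `threshold` controls how many points count as ambiguous and `budget` caps how many of them are sent to the oracle. Both are tuning knobs introduced for illustration, not values taken from the paper.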