A Flexible Convex Optimization Model for Semi-supervised Clustering with Instance-level Constraints

Xianwen Ren, Yong Wang, Xiangsun Zhang
2011-01-01
Abstract: Clustering is a common task in many applications, e.g., digital image processing, text mining, and bioinformatics. Many techniques, such as k-means, hierarchical clustering, and spectral clustering, have been proposed. In a previous study, we proposed a quadratic programming model to address the fuzzy binary clustering problem in the unsupervised setting and then extended it to the general clustering problem. In this paper, we further extend the model to the semi-supervised setting. The extended model has three salient characteristics. First, both the label and link information of known samples can be integrated easily. Second, it illustrates the linkage between hard binary clustering and fuzzy binary clustering in one framework, providing theoretical support for the benefits of fuzzy binary clustering. Third, a fast iterative algorithm is proposed that can be applied to very large data sets. Numerical experiments on two data sets suggest its practical effectiveness and efficiency.