Scalable Constrained Spectral Clustering

Jianyuan Li, Yingjie Xia, Zhenyu Shan, Yuncai Liu
DOI: https://doi.org/10.1109/TKDE.2014.2356471
2015-01-01
Abstract: Constrained spectral clustering (CSC) algorithms have shown great promise in significantly improving clustering accuracy by encoding side information into spectral clustering algorithms. However, existing CSC algorithms are inefficient in handling moderate and large datasets. In this paper, we aim to develop a scalable and efficient CSC algorithm by integrating sparse-coding-based graph construction into a framework called constrained normalized cuts. To this end, we formulate a scalable constrained normalized-cuts problem and solve it based on a closed-form mathematical analysis. We demonstrate that this problem can be reduced to a generalized eigenvalue problem that can be solved very efficiently. We also describe a principled k-way CSC algorithm for handling moderate and large datasets. Experimental results over benchmark datasets demonstrate that the proposed algorithm is highly cost-effective, in the sense that (1) with less side information, it can obtain significant improvements in accuracy compared to the unsupervised baseline; (2) with less computational time, it can achieve high clustering accuracies close to those of state-of-the-art methods.
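To make the phrase "generalized eigenvalue problem" concrete, the sketch below solves the classical unconstrained two-way normalized-cut relaxation, L v = λ D v, on a toy graph. It is an illustration only: it is not the constrained formulation proposed in the paper, it uses no side information and no sparse coding, and the function name `normalized_cut_bipartition` and the toy affinity matrix are assumptions introduced here.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Two-way normalized-cut relaxation (unconstrained baseline).

    Solves the generalized eigenproblem L v = lambda * D v, where
    L = D - W is the graph Laplacian and D is the degree matrix.
    Hypothetical helper for illustration; not the paper's algorithm.
    """
    d = W.sum(axis=1)                    # node degrees (assumed > 0)
    L = np.diag(d) - W                   # unnormalized graph Laplacian
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Convert to the equivalent symmetric standard eigenproblem
    # D^{-1/2} L D^{-1/2} u = lambda * u, with v = D^{-1/2} u.
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    v = d_inv_sqrt * eigvecs[:, 1]       # second-smallest generalized eigenvector
    return (v > 0).astype(int)           # split nodes by sign

# Toy affinity matrix (assumed): two triangles joined by one weak edge.
W = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])
labels = normalized_cut_bipartition(W)
print(labels)
```

The two triangles end up in different groups; which group is labeled 0 and which 1 depends on the arbitrary sign of the eigenvector. A dense solver is used here for brevity, and it would not scale to the moderate and large datasets the abstract targets.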