Fast Spectral Clustering Learning with Hierarchical Bipartite Graph for Large-Scale Data
Xiaojun Yang,Weizhong Yu,Rong Wang,Guohao Zhang,Feiping Nie
DOI: https://doi.org/10.1016/j.patrec.2018.06.024
IF: 4.757
2018-01-01
Pattern Recognition Letters
Abstract:Spectral clustering (SC) is drawing more and more attention due to its effectiveness in unsupervised learning. However, all of these methods still have limitations. First, the method is not suitable for large-scale problems due to its high computational complexity. Second, the neighborhood weighted graph is constructed by the Gaussian kernel, meaning that more work is required to tune the heat-kernel parameter. In order to overcome these issues, we propose a novel spectral clustering based on hierarchical bipartite graph (SCHBG) approach by exploring multiple-layer anchors with a pyramid-style structure. First, the proposed algorithm constructs a hierarchical bipartite graph, and then performs spectral analysis on the graph. As a result, the computational complexity can be largely reduced. Furthermore, we adopt a parameter-free yet effective neighbor assignment strategy to construct the similarity matrix, which avoids the need to tune the heat-kernel parameter. Finally, the algorithm is able to deal with the out-of-sample problem for large-scale data and its computational complexity is significantly reduced. Experiments demonstrate the efficiency and effectiveness of the proposed SCHBG algorithm. Results show that the SCHBG approach can achieve good clustering accuracy (76%) on an 8-million datasets. Furthermore, owing to the use of the bipartite graph, the algorithm can reduce the time cost for out-of-sample situations with almost the same clustering accuracy as for large sizes of data. (C) 2018 Elsevier B.V. All rights reserved.