Scalable Multi-view Spectral Clustering Based on Spectral Perturbation Theory

Xiang Lin, Weixuan Liang, Jiyuan Liu
DOI: https://doi.org/10.1145/3674399.3674434
2024-01-01
Abstract: Multi-view spectral clustering (MVSC) is a typical unsupervised data analysis method in the literature. It aims to integrate the complementary information of different data views to achieve higher clustering accuracy. Under the assumption that all views share a unified clustering structure, the base Laplacian matrices can be regarded as different perturbations of a consensus Laplacian matrix. On this basis, a number of practical MVSC algorithms have been designed. However, almost all of them suffer from high computational complexity because they construct and process matrices whose size is the square of the number of samples. To address this issue, we propose a scalable multi-view spectral clustering method based on spectral perturbation theory that handles large-scale datasets and learns diverse information from each base view. Specifically, we first construct bipartite graphs for all base views and aim to learn a consensus Laplacian matrix of these base bipartite graphs. Based on a perturbation theory of singular subspaces, we design an objective function that minimizes the discrepancy of the canonical angles between the consensus Laplacian matrix and the base Laplacian matrices. Moreover, we impose a matrix-induced regularization term to increase the diversity of the base views. We design a simple but efficient method to solve the resultant problem, and its complexity can be proven to be linear in the number of samples. Finally, we conduct extensive experiments on benchmark datasets to verify the efficiency and effectiveness of the proposed method.
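The two ingredients named in the abstract, a bipartite (sample-to-anchor) graph per view and canonical angles between singular subspaces, can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm or objective function. The function names, the Gaussian k-nearest-anchor weighting, and the column-degree normalization are all assumptions made for illustration, and how anchors are chosen is left to the caller.

```python
import numpy as np

def bipartite_graph(X, anchors, k=3):
    """Hypothetical helper: n x m affinity matrix linking each sample to its k nearest anchors."""
    # Squared Euclidean distances between n samples and m anchors.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    n, m = d2.shape
    idx = np.argsort(d2, axis=1)[:, :k]           # k nearest anchors per sample
    rows = np.arange(n)[:, None]
    d2k = d2[rows, idx]
    # Per-sample bandwidth (an assumption) keeps the exponent bounded.
    sigma = d2k.mean(axis=1, keepdims=True) + 1e-12
    Z = np.zeros((n, m))
    Z[rows, idx] = np.exp(-d2k / sigma)
    Z /= Z.sum(axis=1, keepdims=True)             # each row sums to 1
    return Z

def normalized(Z):
    """Scale columns by the inverse square root of the anchor degrees (assumed normalization)."""
    col_degree = Z.sum(axis=0)
    return Z / np.sqrt(np.maximum(col_degree, 1e-12))

def left_singular_subspace(B, c):
    """Orthonormal basis of the top-c left singular subspace of an n x m matrix."""
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :c]

def canonical_angle_sines(U1, U2):
    """Sines of the canonical angles between two subspaces given by orthonormal bases."""
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)          # guard against rounding error
    return np.sqrt(1.0 - cosines ** 2)
```

Because each bipartite matrix is n x m with m much smaller than n, its thin SVD costs O(n m^2), which is linear in the number of samples. Small sines between the subspaces of two views indicate that the views agree on the clustering structure.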