Subspace-Contrastive Multi-View Clustering

Fu Lele, Zhang Lei, Yang Jinghua, Chen Chuan, Zhang Chuanfu, Zheng Zibin
DOI: https://doi.org/10.48550/arXiv.2210.06795
2022-10-13
Abstract: Most multi-view clustering methods are limited by shallow models without sound nonlinear information perception capability, or fail to effectively exploit complementary information hidden in different views. To tackle these issues, we propose a novel Subspace-Contrastive Multi-View Clustering (SCMC) approach. Specifically, SCMC utilizes view-specific auto-encoders to map the original multi-view data into compact features perceiving its nonlinear structures. Considering the large semantic gap of data from different modalities, we employ subspace learning to unify the multi-view data into a joint semantic space, namely the embedded compact features are passed through multiple self-expression layers to learn the subspace representations, respectively. In order to enhance the discriminability and efficiently excavate the complementarity of various subspace representations, we use the contrastive strategy to maximize the similarity between positive pairs while differentiating negative pairs. Thus, a weighted fusion scheme is developed to initially learn a consistent affinity matrix. Furthermore, we employ the graph regularization to encode the local geometric structure within varying subspaces for further fine-tuning the appropriate affinities between instances. To demonstrate the effectiveness of the proposed model, we conduct a large number of comparative experiments on eight challenging datasets. The experimental results show that SCMC outperforms existing shallow and deep multi-view clustering methods.
Machine Learning
What problem does this paper attempt to address?
This paper addresses two main problems in multi-view clustering:

1. **Shallow models lack effective nonlinear information-perception ability**: Most existing multi-view clustering methods rely on shallow models that cannot capture the nonlinear structure of the data, which limits their performance on high-dimensional, complex real-world data.
2. **Failure to fully utilize complementary information between different views**: Existing methods often fail to mine the complementary information hidden in different views, which degrades clustering performance.

To solve these problems, the authors propose Subspace-Contrastive Multi-View Clustering (SCMC). Its main components are:

- **View-specific auto-encoders (AEs)**: Each view's original data is mapped into a compact feature space that captures its nonlinear structure, so high-dimensional, nonlinear data can be handled better.
- **Subspace learning to unify multi-view data**: To bridge the semantic gap between data of different modalities, SCMC passes the embedded compact features through multiple self-expression layers, one per view, to learn subspace representations. This places data from different modalities in a joint semantic space.
- **Contrastive strategy to enhance discriminability and mine complementary information**: SCMC maximizes the similarity between positive pairs while pushing negative pairs apart. The representations of the same sample under different views form positive pairs; all other pairs are negative pairs.
- **Weighted fusion scheme and graph regularization**: To obtain a consistent affinity matrix, SCMC fuses the subspace representations with a weighted scheme and applies graph regularization to encode the local geometric structure within the learned subspaces, which further fine-tunes the affinities between instances.

According to the authors, SCMC outperforms existing shallow and deep multi-view clustering methods on eight challenging datasets.

### Mathematical formula summary

- **Auto-encoder encoding process**:
  \[ C^{(v)} = f_v(X^{(v)} \mid W_e^{(v)}, b_e^{(v)}) \]
  where \(C^{(v)}\) is the embedded feature of the \(v\)-th view, \(f_v(\cdot)\) is the encoder function, and \(W_e^{(v)}\) and \(b_e^{(v)}\) are the weight matrix and bias vector of the encoder, respectively.
- **Subspace representation learning loss**:
  \[ L_{\text{Sub}} = \sum_{v=1}^{V} \left\| C^{(v)T} - C^{(v)T} Z^{(v)} \right\|_F^2 \]
  where \(Z^{(v)}\) is the subspace representation (self-expression coefficient matrix) learned for the \(v\)-th view.
- **Contrastive loss**:
  \[ \ell_v = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{\exp\left(\Theta(Z_i^{(v)}, Z_i^{(k)})/\tau\right)}{\sum_{j=1}^{N}\left(\exp\left(\Theta(Z_i^{(v)}, Z_j^{(v)})/\tau\right)+\exp\left(\Theta(Z_i^{(v)}, Z_j^{(k)})/\tau\right)\right)} \]
  \[ L_{\text{Con}} = \frac{1}{NV}\sum_{v=1}^{V}\ell_v \]
  where \(k \neq v\) indexes another view, \(\Theta(Z_i^{(v)}, Z_j^{(k)}) = \frac{(Z_i^{(v)})^T Z_j^{(k)}}{\|Z_i^{(v)}\|\,\|Z_j^{(k)}\|}\) is the cosine similarity, and \(\tau\) is the temperature.
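The subspace representation learning loss can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes each embedded feature matrix `C` has shape `(N, d)` with one sample per row, so the residual is taken as \(C^{T} - C^{T}Z\) with `Z` of shape `(N, N)`. The function names and toy sizes are hypothetical.

```python
import numpy as np


def self_expression_loss(C, Z):
    """Squared Frobenius residual ||C^T - C^T Z||_F^2 for a single view.

    C: (N, d) embedded features, one sample per row (assumed layout).
    Z: (N, N) self-expression coefficients.
    """
    residual = C.T - C.T @ Z
    return float(np.sum(residual ** 2))


def subspace_loss(Cs, Zs):
    """Sum of the per-view self-expression residuals over all views."""
    return sum(self_expression_loss(C, Z) for C, Z in zip(Cs, Zs))


# Toy data: V views, N samples, d-dimensional embeddings (all hypothetical).
rng = np.random.default_rng(0)
N, d, V = 6, 3, 2
Cs = [rng.normal(size=(N, d)) for _ in range(V)]

# Z = I reconstructs every sample from itself, so the residual vanishes.
loss_identity = subspace_loss(Cs, [np.eye(N) for _ in range(V)])

# Z = 0 reconstructs nothing, so the loss is the total energy of the features.
loss_zero = subspace_loss(Cs, [np.zeros((N, N)) for _ in range(V)])
```

The identity case shows why a self-expression term on its own admits a trivial solution; subspace clustering methods commonly add a regularizer or a zero-diagonal constraint on `Z` to rule it out. The summary above does not say how SCMC handles this.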
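The contrastive term \(\ell_v\) for one pair of views can likewise be sketched in NumPy. This is a minimal illustration of the formula as written above, not the authors' code. The function names, the default temperature `tau=0.5`, and the small `eps` guard against zero-norm rows are assumptions, and the aggregation over views into \(L_{\text{Con}}\) is left out.

```python
import numpy as np


def cosine_similarity_matrix(A, B, eps=1e-12):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A_n = A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)
    B_n = B / (np.linalg.norm(B, axis=1, keepdims=True) + eps)
    return A_n @ B_n.T


def contrastive_loss_pair(Zv, Zk, tau=0.5):
    """Contrastive term for view v against view k.

    Zv, Zk: (N, N') row-wise subspace representations of the same N samples.
    The positive pair for sample i is (Zv[i], Zk[i]); the denominator sums
    the intra-view and inter-view similarities over all j, as in the formula.
    """
    intra = np.exp(cosine_similarity_matrix(Zv, Zv) / tau)  # (N, N)
    inter = np.exp(cosine_similarity_matrix(Zv, Zk) / tau)  # (N, N)
    positives = np.diag(inter)
    denominator = intra.sum(axis=1) + inter.sum(axis=1)
    return float(-np.mean(np.log(positives / denominator)))


# Toy check with hypothetical data: views that agree on each sample should
# score a lower loss than views whose rows are misaligned.
rng = np.random.default_rng(0)
Zv = rng.normal(size=(8, 8))
loss_aligned = contrastive_loss_pair(Zv, Zv.copy())
loss_shuffled = contrastive_loss_pair(Zv, np.roll(Zv, 1, axis=0))
```

Rolling the rows of the second view keeps the denominator unchanged (the same set of similarities is summed) but replaces each positive pair with a mismatched one, so the loss increases.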