Deep cross-modal subspace clustering with Contrastive Neighbour Embedding

Zihao Zhang, Qianqian Wang, Chengquan Pei, Quanxue Gao
DOI: https://doi.org/10.1016/j.neucom.2024.127318
IF: 6
2024-01-31
Neurocomputing
Abstract: Deep cross-modal clustering has developed rapidly and attracted considerable attention in recent years. It uses deep neural networks to learn a consistent subspace shared by different modalities and has achieved remarkable clustering performance. However, most existing methods do not simultaneously consider the inherently diverse information of each modality and the neighbour geometric structure of cross-modal data, which inevitably degrades the cluster structure revealed by the common subspace. In this paper, we propose a novel method, Deep Cross-Modal Subspace Clustering with Contrastive Neighbour Embedding (DCSC-CNE), to address this challenge. DCSC-CNE preserves the inherent independence of each modality while uncovering the information that is consistent across modalities. In addition, we introduce a contrastive neighbour graph into the proposed deep cross-modal subspace clustering framework by performing contrastive learning between positive and negative samples, which highlights the underlying neighbour geometry of the original data and yields discriminative latent (subspace) representations. In this way, DCSC-CNE integrates consistent-inherent learning and contrastive neighbour embedding into a unified deep learning framework. Experimental results on six benchmark datasets demonstrate that the proposed method significantly improves cross-modal subspace clustering performance compared with state-of-the-art methods.
computer science, artificial intelligence
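The abstract does not give the loss itself, so the following is only a minimal sketch of what contrastive learning over a neighbour graph could look like, not the authors' formulation. It assumes that positives are each sample's k nearest neighbours in the original feature space, that all remaining samples act as negatives, and that an InfoNCE-style objective is used. The names knn_mask and contrastive_neighbour_loss and the parameters k and temperature are hypothetical and do not come from the paper.

```python
import numpy as np


def knn_mask(x, k):
    """Boolean (n, n) mask; mask[i, j] is True when j is one of i's k nearest neighbours."""
    sq = np.sum(x ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]
    mask = np.zeros(dist.shape, dtype=bool)
    mask[np.arange(x.shape[0])[:, None], idx] = True
    return mask


def contrastive_neighbour_loss(z, pos_mask, temperature=0.5):
    """InfoNCE-style loss (an assumption): neighbours are positives, the rest negatives."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity via unit vectors
    sim = (z @ z.T) / temperature
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity from the softmax
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    # Average log-probability assigned to each sample's positives.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1) / pos_mask.sum(axis=1)
    return float(-pos_log_prob.mean())


# Toy data: two well-separated groups of four points each.
x = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1]])
mask = knn_mask(x, k=3)

# Latent codes that respect the neighbour structure versus codes that mix the groups.
z_aligned = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4)
z_mixed = np.array([[1.0, 0.0], [0.0, 1.0]] * 4)

loss_aligned = contrastive_neighbour_loss(z_aligned, mask)
loss_mixed = contrastive_neighbour_loss(z_mixed, mask)
```

In this toy setting the loss is lower for the latent codes that keep neighbours together than for the codes that mix the two groups, which is the behaviour a neighbour-preserving contrastive term is meant to encourage. In a full model the codes would come from modality-specific encoders and the term would be trained jointly with the subspace clustering objective; those parts are not sketched here.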