CSMDC: Exploring Consistently Context Semantics for Multi-View Document Clustering

Ruina Bai, Ruizhang Huang, Le Xu, Yongbin Qin
DOI: https://doi.org/10.1016/j.eswa.2024.125386
2025-01-01
Abstract: Multi-view document clustering, which aims to discover clustering partitions based on multiple document views, has attracted increasing research interest. However, the potential advantages of incorporating context semantics to enhance multi-view document clustering have yet to be fully explored. To address this limitation, we propose CSMDC, a deep multi-view document clustering model that explores consistent context semantics and consists of three modules. Specifically, a novel view-translator is designed to convert non-contextual document views into contextual views. With its help, all document views can be processed within the view-translator representation learning module to obtain their semantic representations. The data-based view consistency self-supervising module then fine-tunes the semantic representations of the document views by jointly incorporating view-wise representation relevance and consistent clustering assignments. Additionally, the task-based document clustering module simultaneously improves the view semantic representations and the document clustering results. To the best of our knowledge, this is the first study to explicitly apply consistent context semantics under the guidance of data-based and task-based objectives in multi-view document clustering. Comprehensive experimental results demonstrate the effectiveness of the proposed model.