Deep Multi-View Document Clustering with Enhanced Semantic Embedding.

Ruina Bai, Ruizhang Huang, Yanping Chen, Yongbin Qin
DOI: https://doi.org/10.1016/j.ins.2021.02.027
IF: 8.1
2021-01-01
Information Sciences
Abstract: Multi-view clustering, which aims to group data with multiple views, has recently attracted intense research attention. Text documents bring additional difficulties to multi-view clustering due to the sparseness, high dimensionality, and inconsistency of document views. In this paper, we introduce a novel model for multi-view document clustering with enhanced semantic embedding, namely MDCE, to address the above difficulties of clustering text documents with more than one representation view. Enhanced semantic embedders are designed to learn and improve the semantic mapping from the higher-dimensional document space to a lower-dimensional feature space using complementary semantic information. Specifically, three types of complementary semantic information are incorporated in an unsupervised manner: neighbour-wise, view-wise, and cluster-wise complementary information. A deep network is designed to simultaneously optimize the enhanced semantic mapping, integrate lower-dimensional features from multiple views, and discover document clustering assignments. We conducted extensive experiments comparing the proposed MDCE model with a number of state-of-the-art multi-view clustering approaches on realistic datasets. Experimental results demonstrate that the MDCE-related models perform substantially better than all other models.