Deep Structural Enhanced Network for Document Clustering

Ren Lina, Qin Yongbin, Chen Yanping, Bai Ruina, Xue Jingjing, Huang Ruizhang
DOI: https://doi.org/10.1007/s10489-022-04112-z
IF: 5.3
2022-01-01
Applied Intelligence
Abstract: Recently, deep document clustering, which employs deep neural networks to learn semantic document representations for clustering, has attracted increasing research interest. Traditional deep document clustering models rely only on the internal content features of a document to learn its representation, and therefore suffer from insufficient representation learning. In this paper, we introduce a deep structural enhanced network for document clustering, namely DSEDC. The DSEDC model enhances the AE-based internal document representation with GCN-based external structural document semantics to achieve better clustering performance. An ensemble-reinforced enhancement strategy is designed in which two representations are learned in a layer-by-layer reinforcement manner: a complete document representation, captured by fusing document internal semantics and external semantics, and an enhanced document internal representation, captured with the help of the complete document representation. Extensive experiments demonstrate that our proposed DSEDC model performs substantially better than state-of-the-art deep document clustering models.
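The layer-by-layer fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the fusion rule, the layer sizes, or how the document graph is built, so the function names (`normalize_adjacency`, `fused_forward`), the linear blend controlled by the hypothetical `mix` parameter, and the way the complete representation is fed back into the internal one are all assumptions made for illustration. Only the forward pass is shown; decoding, training, and the clustering objective are omitted.

```python
import numpy as np


def normalize_adjacency(adj):
    """Standard GCN normalization: D^-1/2 (A + I) D^-1/2."""
    adj = adj + np.eye(adj.shape[0])      # add self-loops, so every degree >= 1
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def relu(x):
    return np.maximum(x, 0.0)


def fused_forward(x, adj, ae_weights, gcn_weights, mix=0.5):
    """Forward pass that fuses AE and GCN features at every layer.

    x           : (n_docs, n_features) document content features
    adj         : (n_docs, n_docs) symmetric document graph (assumed given)
    ae_weights  : list of AE encoder weight matrices, one per layer
    gcn_weights : list of GCN weight matrices with the same shapes
    mix         : hypothetical blend factor (an assumption, not from the paper)
    """
    a_hat = normalize_adjacency(adj)
    internal = x   # AE-side (internal) representation
    complete = x   # fused (complete) representation
    for w_ae, w_gcn in zip(ae_weights, gcn_weights):
        # Internal semantics: one AE encoder layer on the content features.
        internal = relu(internal @ w_ae)
        # External semantics: one GCN layer propagating over the graph.
        external = relu(a_hat @ complete @ w_gcn)
        # Complete representation: blend of internal and external semantics.
        complete = (1.0 - mix) * internal + mix * external
        # Enhanced internal representation: pulled toward the complete one
        # before it enters the next AE layer (assumed reinforcement rule).
        internal = (1.0 - mix) * internal + mix * complete
    return complete
```

The returned matrix has one row per document and could be passed to any clustering step, for example k-means. The single `mix` factor is used for both blends only to keep the sketch short.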