Hierarchical Neural Topic Model with Embedding Cluster and Neural Variational Inference.
Ningjing Wang, Deqing Wang, Ting Jiang, Chenguang Du, Chuyu Fang, Fuzhen Zhuang
DOI: https://doi.org/10.1137/1.9781611977653.ch105
2023-01-01
Abstract: Compared to flat topic models, hierarchical topic models not only exploit the inherent structural information in the corpus but also detect better semantic topics with the help of hierarchy knowledge. Recently, hierarchical neural topic models based on Neural Variational Inference (NVI) have achieved better performance. However, existing NVI-based models learn topics at every level with the same strategy, i.e., word co-occurrence patterns. As a result, topics at different levels cannot be distinguished from a semantic perspective, and first-level topics degenerate into meaningless common words. To address these problems, we propose a novel Hierarchical Neural Topic Model with embedding cluster and neural variational inference (C-HNTM). First, C-HNTM adopts a Gaussian Mixture Model (GMM) to learn first-level topics from word embeddings, which captures the global semantic information of the whole corpus and generates more meaningful, global semantic topics. Second, an NVI-based method learns second-level topics from Bag-of-Words representations at the document level, which generates local and more detailed topics. Third, we simultaneously learn the global and local topic distributions and the dependency matrix using the Stochastic Gradient Variational Bayes (SGVB) estimator. Finally, we provide a detailed derivation of the variational lower bound and extensive experiments on three real-world datasets to validate the effectiveness of our model.
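To illustrate the first-level step described in the abstract, the following is a minimal sketch of how a Gaussian mixture softly assigns word embeddings to global topics. It is not the authors' implementation: the toy 2-D embeddings, the fixed means, variances, and weights, the isotropic covariance, and the function name `gmm_responsibilities` are all assumptions made for illustration. In C-HNTM the mixture parameters are learned jointly with the rest of the model rather than fixed by hand.

```python
import math

def gmm_responsibilities(x, means, variances, weights):
    """Return the posterior p(k | x) under a mixture of isotropic Gaussians."""
    d = len(x)
    log_probs = []
    for mu, var, w in zip(means, variances, weights):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        # Log density of an isotropic Gaussian with variance `var` per dimension.
        log_pdf = -0.5 * (d * math.log(2 * math.pi * var) + sq_dist / var)
        log_probs.append(math.log(w) + log_pdf)
    # Normalize in log space (log-sum-exp) for numerical stability.
    m = max(log_probs)
    shifted = [math.exp(lp - m) for lp in log_probs]
    total = sum(shifted)
    return [s / total for s in shifted]

# Hypothetical 2-D word embeddings; real embeddings would be much higher-dimensional.
embeddings = {
    "goal": [1.0, 0.9],
    "match": [0.9, 1.1],
    "stock": [-1.0, -0.8],
    "market": [-1.1, -1.0],
}

# Hypothetical first-level topic components (assumed fixed here, not learned).
means = [[1.0, 1.0], [-1.0, -1.0]]
variances = [0.5, 0.5]
weights = [0.5, 0.5]

assignments = {
    word: gmm_responsibilities(vec, means, variances, weights)
    for word, vec in embeddings.items()
}

for word, resp in assignments.items():
    print(word, [round(r, 4) for r in resp])
```

Each word receives a probability vector over first-level topics based on its embedding rather than on co-occurrence counts, which is the distinction the abstract draws between the global (embedding-based) and local (Bag-of-Words-based) levels.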