Robust Word-Network Topic Model For Short Texts

Fei Wang, Rui Liu, Yuan Zuo, Hui Zhang, He Zhang, Junjie Wu
DOI: https://doi.org/10.1109/ICTAI.2016.0132
2016-01-01
Abstract: With the rapid development of online social media, short text has become the prevalent format for information on the Internet. Because of severe data sparsity, accurately discovering the knowledge behind these short texts remains a critical challenge. Since regular topic models, such as Latent Dirichlet Allocation (LDA), cannot perform well on short texts, many efforts have been devoted to building different types of probabilistic topic models for short texts. Inducing topics from the dense word-word space instead of the sparse document-word space has become an emerging solution for avoiding the data sparsity issue, and the representative approach is the Word Network Topic Model (WNTM). However, the word-word space construction procedure of WNTM often introduces much irrelevant information. In light of this, we propose the Robust WNTM (RWNTM), which can filter out unrelated information during sampling. The experimental results demonstrate that our method learns more coherent topics and is more accurate in text classification than WNTM and other state-of-the-art methods.