Efficient Topic Modeling on Phrases via Sparsity

Weijing Huang, Wei Chen, Tengjiao Wang, Shibo Tao
DOI: https://doi.org/10.1109/ICTAI.2017.00050
2017-01-01
Abstract: Topic modeling on phrases is important for understanding documents because it provides interpretable topics. However, existing methods are less efficient than word-level topic modeling methods, which may limit their potential applications. To provide a more efficient method, we propose a novel topic model, SparseTP, which (1) models words and phrases by linking them in a Markov Random Field when necessary; (2) provides a well-formed lower bound of the model for Gibbs sampling; and (3) exploits the sparse distribution of words and phrases over topics to speed up inference. Experiments demonstrate that it achieves high efficiency without sacrificing effectiveness.
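The abstract does not spell out SparseTP's sampler, so the following is only a rough illustration of how sparsity can speed up Gibbs sampling in general. It sketches the well-known SparseLDA-style bucket decomposition for plain word-level LDA, not the authors' method. All function and parameter names (`dense_weights`, `bucket_masses`, `sample_topic_sparse`, `doc_counts`, `word_counts`, `topic_totals`) are hypothetical, and a symmetric `alpha` and `beta` are assumed.

```python
def dense_weights(doc_counts, word_counts, topic_totals, alpha, beta, vocab_size):
    """Unnormalized collapsed-Gibbs weights for every topic (O(K) work)."""
    weights = []
    for t, n_t in enumerate(topic_totals):
        n_td = doc_counts.get(t, 0)   # tokens of this document assigned to topic t
        n_wt = word_counts.get(t, 0)  # occurrences of this word assigned to topic t
        weights.append((n_td + alpha) * (n_wt + beta) / (n_t + beta * vocab_size))
    return weights


def bucket_masses(doc_counts, word_counts, topic_totals, alpha, beta, vocab_size):
    """Split the same total mass into smoothing (s), document (r), word (q) buckets.

    r and q only touch topics with nonzero counts. s is recomputed here in O(K)
    for clarity; an actual sampler would cache it and update it incrementally.
    """
    denom = [n_t + beta * vocab_size for n_t in topic_totals]
    s = sum(alpha * beta / d for d in denom)
    r = sum(n_td * beta / denom[t] for t, n_td in doc_counts.items())
    q = sum((alpha + doc_counts.get(t, 0)) * n_wt / denom[t]
            for t, n_wt in word_counts.items())
    return s, r, q


def sample_topic_sparse(doc_counts, word_counts, topic_totals,
                        alpha, beta, vocab_size, u):
    """Pick a topic from a uniform draw u in [0, 1), checking sparse buckets first."""
    denom = [n_t + beta * vocab_size for n_t in topic_totals]
    s, r, q = bucket_masses(doc_counts, word_counts, topic_totals,
                            alpha, beta, vocab_size)
    x = u * (s + r + q)
    if x < q:  # word bucket: iterate only over topics where this word occurs
        for t, n_wt in word_counts.items():
            x -= (alpha + doc_counts.get(t, 0)) * n_wt / denom[t]
            if x < 0:
                return t
    else:
        x -= q
    if x < r:  # document bucket: iterate only over topics present in the document
        for t, n_td in doc_counts.items():
            x -= n_td * beta / denom[t]
            if x < 0:
                return t
    else:
        x -= r
    for t, d in enumerate(denom):  # smoothing bucket: dense, but rarely reached
        x -= alpha * beta / d
        if x < 0:
            return t
    return len(topic_totals) - 1  # guard against floating-point round-off
```

In this sketch most of the probability mass typically falls in the word bucket, so a draw usually terminates after visiting only the few topics in which the word actually appears. How SparseTP extends this kind of idea to phrases and its Markov Random Field links is not stated in the abstract.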