Incorporating Probabilistic Knowledge into Topic Models.

Liang Yao, Yin Zhang, Baogang Wei, Hongze Qian, Yibing Wang
DOI: https://doi.org/10.1007/978-3-319-18032-8_46
2015-01-01
Abstract: Probabilistic topic models can be used to extract low-dimensional aspects from document collections. However, without any human knowledge, such models often produce aspects that are not interpretable. In recent years, a number of knowledge-based models have been proposed that allow the user to input prior knowledge of the domain in order to produce more coherent and meaningful topics. In this paper, we incorporate human knowledge, in the form of a probabilistic knowledge base, into topic models. By combining latent Dirichlet allocation, a widely used topic model, with Probase, a large-scale probabilistic knowledge base, we significantly improve semantic coherence. Our evaluation results demonstrate the effectiveness of our method.