Domain Dictionary-Based Topic Modeling For Social Text

Bo Jiang, Jiguang Liang, Ying Sha, Rui Li, Lihong Wang
DOI: https://doi.org/10.1007/978-3-319-48740-3_8
2016-01-01
Abstract: Online social networks are becoming increasingly popular, and their users post large volumes of unstructured social text every day. Inferring topics from large-scale social text is an important but challenging task for many text mining applications. Conventional topic models have been shown to give unsatisfactory results because content in short texts is sparse and noisy. Moreover, the semantics of the learned topics are difficult to interpret from the top-weighted terms alone. In this paper, we propose a novel social text topic modeling method to address these problems. The proposed model uses a topic domain dictionary to construct a weakly supervised matrix, whose role is to make the reference matrix and the learned topic matrix similar. Experimental results on a social text dataset constructed from Twitter demonstrate that the proposed method significantly outperforms state-of-the-art baselines and also improves the semantic relevance of the learned topics.
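The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea under stated assumptions: that the topic model is a non-negative matrix factorization X ≈ WH, that the domain dictionary is turned into a 0/1 reference matrix R (topics by terms), and that similarity is encouraged by a penalty lam * ||H - R||^2 on the topic matrix H. The function names, the penalty form, the initialization from R, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def build_reference_matrix(vocab, domain_dict):
    """Return topic names and a (topics x terms) 0/1 matrix.

    R[k, j] is 1 when vocab[j] appears in the dictionary of topic k.
    This construction is an assumption, not the paper's definition.
    """
    index = {term: j for j, term in enumerate(vocab)}
    topics = list(domain_dict)
    R = np.zeros((len(topics), len(vocab)))
    for k, topic in enumerate(topics):
        for term in domain_dict[topic]:
            if term in index:
                R[k, index[term]] = 1.0
    return topics, R

def dictionary_guided_nmf(X, R, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Minimize ||X - WH||^2 + lam * ||H - R||^2 with W, H >= 0.

    Uses Lee-Seung style multiplicative updates; the penalty term adds
    lam * R to the numerator and lam * H to the denominator of the H update.
    """
    rng = np.random.default_rng(seed)
    n_docs = X.shape[0]
    n_topics = R.shape[0]
    W = rng.random((n_docs, n_topics)) + eps
    # Start H near the reference so topic k is tied to dictionary k
    # (an illustrative design choice).
    H = R + 0.1 * rng.random(R.shape)
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X + lam * R) / (W.T @ W @ H + lam * H + eps)
    return W, H

# Toy document-term counts, invented purely for illustration.
vocab = ["goal", "match", "team", "vote", "election", "party"]
domain_dict = {"sports": ["goal", "team"], "politics": ["vote", "party"]}
X = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 2, 1, 1],
], dtype=float)

topics, R = build_reference_matrix(vocab, domain_dict)
W, H = dictionary_guided_nmf(X, R)

# Show the three highest-weighted terms per topic.
for name, row in zip(topics, H):
    top_terms = [vocab[j] for j in np.argsort(row)[::-1][:3]]
    print(name, top_terms)
```

In this sketch a larger `lam` keeps each row of `H` closer to its dictionary terms, which makes the topic easier to label, while a smaller `lam` lets the data dominate. How the paper itself weights or constructs the supervision is not stated in the abstract.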