ULW-DMM: An Effective Topic Modeling Method for Microblog Short Text

Jia Yu, Lirong Qiu
DOI: https://doi.org/10.1109/access.2018.2885987
IF: 3.9
2019-01-01
IEEE Access
Abstract: With the popularity of social media, including microblogs, mining useful information from short texts has become increasingly important. However, the sparseness, high dimensionality, and sheer volume of the data make this a very challenging task. In this paper, we propose ULW-DMM, a method that extends the Dirichlet multinomial mixture (DMM) topic model by combining the user-LDA topic model, which is based on internal data expansion, with latent feature vector representations of words trained on a very large external corpus. The experimental results show that the ULW-DMM model yields a relatively large improvement in topic consistency and in classification tasks when modeling topics in microblog short texts.