Online Topic Evolution Modeling Based on Hierarchical Dirichlet Process.

Tao Ma, Dacheng Qu, Rui Ma, Wei Feng, Kan Li
DOI: https://doi.org/10.1109/dsc.2016.60
2016-01-01
Abstract: This paper presents a model based on the Hierarchical Dirichlet Process (HDP) that automatically captures evolutionary thematic patterns in text. Our approach allows HDP to work in an online fashion: given the old model, it can build an up-to-date model for new documents without accessing historical data. Since exact posterior calculation is infeasible, we use Gibbs sampling to carry out approximate posterior inference. Once the topics are found, we analyze the evolution relationships between time-adjacent topics. Experiments on a real-world dataset (Reuters-21578) validate the effectiveness of the model quantitatively, showing its advantage over both OLDA and plain HDP in modeling topic evolution.
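
The abstract does not say how evolution relationships between time-adjacent topics are measured. As an illustration only, one plausible approach is to compare topic-word distributions from consecutive time slices and link pairs that are close. The sketch below assumes Jensen-Shannon divergence as the distance and a fixed threshold; the function names, the threshold value, and the toy distributions are all hypothetical and are not taken from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0 for identical distributions and 1 for distributions with
    disjoint support. Assumed here as the topic distance; the paper's
    actual measure is not given in the abstract.
    """
    def kl(a, b):
        # Terms with a == 0 contribute nothing to the KL sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def link_topics(prev_topics, curr_topics, threshold=0.3):
    """Link each topic at time t-1 to the topics at time t that lie within
    `threshold` divergence. The threshold is a hypothetical tuning knob."""
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(curr_topics):
            d = js_divergence(p, q)
            if d <= threshold:
                links.append((i, j, d))
    return links


# Toy topic-word distributions over a two-word vocabulary.
prev_topics = [[1.0, 0.0], [0.0, 1.0]]
curr_topics = [[1.0, 0.0]]
links = link_topics(prev_topics, curr_topics)
print(links)
```

In this toy case the first old topic continues into the new slice, while the second has no successor, which could be read as that topic dying out. Under the same assumptions, a new topic with no predecessor would indicate topic birth.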