Sentence level topic models for associated topics extraction

Haixin Jiang, Rui Zhou, Limeng Zhang, Hua Wang, Yanchun Zhang
DOI: https://doi.org/10.1007/s11280-018-0639-1
2018-10-18
World Wide Web
Abstract: In the LDA model, the independence assumptions implicit in the Dirichlet distribution over topic proportions make it impossible to model connections between topics. Some researchers have attempted to relax these assumptions and have thus obtained more powerful topic models. Following this strategy, we develop an associated topic model (ATM) that uses an association matrix to measure the association between latent topics. In the ATM, consecutive sentences are considered important, and the topic assignments for words are jointly determined by the association matrix and the sentence-level topic distributions, rather than by the document-specific topic distributions alone. This approach models latent topic connections more realistically, since the presence of one topic may be connected with the presence of another. We derive a collapsed Gibbs sampling algorithm for inference and parameter estimation in the ATM. Experimental results demonstrate that the ATM gives a more practical interpretation and is capable of learning more associated topics.