Abstract: Deep probabilistic aspect models are widely used in document analysis to extract semantic information and obtain descriptive topics. However, two problems may limit their application. The first is that common words shared among all documents, which carry little representational meaning, may reduce the representation ability of the learned topics. The second is that it is difficult to introduce supervision information into hierarchical topic models so as to fully utilize the side information of documents. To address these problems, in this article, we first propose deep diverse latent Dirichlet allocation (DDLDA), a deep hierarchical topic model that introduces shared topics and thereby yields more meaningful semantic topics containing fewer common and meaningless words. Moreover, we develop a variational inference network for DDLDA, which helps us further generalize DDLDA to a supervised deep topic model called max-margin DDLDA (mmDDLDA) by employing the max-margin principle as the classification criterion. Compared to DDLDA, mmDDLDA can discover more discriminative topical representations. In addition, a continual hybrid method combining stochastic-gradient MCMC and variational inference is put forward for deep latent Dirichlet allocation (DLDA)-based models to make them more practical in real-world applications. The experimental results demonstrate that DDLDA and mmDDLDA are more efficient than existing unsupervised and supervised topic models in discovering highly discriminative topic representations and achieving higher classification accuracy. Meanwhile, DLDA and our proposed models, when trained by the proposed continual learning approach, can not only perform well in preventing catastrophic forgetting but also fit the evolving new tasks well.

Latent Dirichlet Allocation - An approach for topic discovery

Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey

A Spectral Algorithm for Latent Dirichlet Allocation

LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation

Topic Modeling based Consumer behavior analysis using Latent Dirichlet Allocation

Topic Modeling on Online News Portal Using Latent Dirichlet Allocation (LDA)

Deep Learning based Topic Analysis on Financial Emerging Event Tweets

Novel mixture allocation models for topic learning

Max-Margin Deep Diverse Latent Dirichlet Allocation With Continual Learning

News Topic Discovery Through Community Detection

Topic Analysis for Text with Side Data

Using Topic Modeling Methods for Short-Text Data: A Comparative Analysis

Search and classify topics in a corpus of text using the latent dirichlet allocation model

Latent Dirichlet allocation (LDA) for topic modeling of the CFPB consumer complaints

Constrained Latent Dirichlet Allocation For Subgroup Discovery With Topic Rules

DATM: A Novel Data Agnostic Topic Modeling Technique With Improved Effectiveness for Both Short and Long Text

Investigating topic modeling techniques through evaluation of topics discovered in short texts data across diverse domains

Performance evaluation of Latent Dirichlet Allocation on legal documents

Deep de Finetti: Recovering Topic Distributions from Large Language Models

Prior-Based Dual Additive Latent Dirichlet Allocation for User-Item Connected Documents.

Incorporating Hierarchical Dirichlet Process into Tag Topic Model.