Abstract:We address two challenges of probabilistic topic modelling in order to better estimate the probability of a word in a given context, i.e., P(word|context): (1) No Language Structure in Context: Probabilistic topic models ignore word order by summarizing a given context as a "bag-of-word" and consequently the semantics of words in the context is lost. The LSTM-LM learns a vector-space representation of each word by accounting for word order in local collocation patterns and models complex characteristics of language (e.g., syntax and semantics), while the TM simultaneously learns a latent representation from the entire document and discovers the underlying thematic structure. We unite two complementary paradigms of learning the meaning of word occurrences by combining a TM (e.g., DocNADE) and a LM in a unified probabilistic framework, named as ctx-DocNADE. (2) Limited Context and/or Smaller training corpus of documents: In settings with a small number of word occurrences (i.e., lack of context) in short text or data sparsity in a corpus of few documents, the application of TMs is challenging. We address this challenge by incorporating external knowledge into neural autoregressive topic models via a language modelling approach: we use word embeddings as input of a LSTM-LM with the aim to improve the word-topic mapping on a smaller and/or short-text corpus. The proposed DocNADE extension is named as ctx-DocNADEe. We present novel neural autoregressive topic model variants coupled with neural LMs and embeddings priors that consistently outperform state-of-the-art generative TMs in terms of generalization (perplexity), interpretability (topic coherence) and applicability (retrieval and classification) over 6 long-text and 8 short-text datasets from diverse domains.

A Joint Topical N-Gram Language Model Based on LDA

On Modelling Non-Linear Topical Dependencies

A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

A Joint Model of Conversational Discourse and Latent Topics on Microblogs

Statistical Word Sense Aware Topic Models

Joint Latent Dirichlet Allocation for Social Tags.

Topic Models Incorporating Statistical Word Senses

LTSG: Latent Topical Skip-Gram for Mutually Learning Topic Model and Vector Representations

Source-LDA: Enhancing probabilistic topic models using prior knowledge sources

Adaptation of Language Models for SMT Using Neural Networks with Topic Information.

Jointly Modeling Topics and Intents with Global Order Structure.

Refine Bigram PLSA Model by Assigning Latent Topics Unevenly

Joint n-gram Chinese language modeling with an application to Chinese word segmentation

Contextual-LDA: A Context Coherent Latent Topic Model for Mining Large Corpora.

A Context-Aware Topic Model for Statistical Machine Translation.

Locally discriminative topic modeling

textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior

Joint Topic-Document Modeling Via Low-Dimensional Sparse Models

Neural Topic Modeling with Large Language Models in the Loop

Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

Sentiment Analysis with Global Topics and Local Dependency