Abstract: With the emergence and development of deep generative models such as variational auto-encoders (VAEs), research on topic modeling has extended to a new area: neural topic modeling, which aims to learn disentangled topics to understand the data better. However, the original VAE framework has been shown to be limited in disentanglement performance, and a neural topic model (NTM) built on it inherits this defect. In this paper, we argue that the optimization objectives of contrastive learning are consistent with two important goals of well-disentangled topic learning: alignment and uniformity. These objectives are also consistent with two key evaluation measures for topic models: topic coherence and topic diversity. We therefore conclude that the alignment and uniformity of disentangled topic learning can be quantified with topic coherence and topic diversity. Accordingly, we propose the Contrastive Disentangled Neural Topic Model (CNTM). By representing both words and topics as low-dimensional vectors in the same embedding space, we apply contrastive learning to neural topic modeling to produce factorized and disentangled topics in an interpretable manner. We compare CNTM with strong baseline models on widely used metrics. Under the most general evaluation setting (100% of topics selected), our model achieves the best topic coherence scores, with improvements of 25.0%, 10.9%, 24.6%, and 51.3% over the second-best models' scores on the 20 Newsgroups, Web Snippets, Tag My News, and Reuters datasets, respectively. Our method also achieves the second-best topic diversity scores on the 20 Newsgroups and Web Snippets datasets. Our experimental results show that CNTM can effectively leverage the disentanglement ability of contrastive learning to address the inherent defect of neural topic modeling and obtain better topic quality.
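
As a rough illustration of one of the evaluation measures named above, the sketch below computes topic diversity as the fraction of unique words among the top-k words of all topics. This is a commonly used definition, but the abstract does not state which definition the paper adopts, so treat it as an assumption. The function name `topic_diversity`, the `top_k` parameter, and the toy topics are hypothetical and chosen only for illustration.

```python
def topic_diversity(topics, top_k=25):
    """Fraction of unique words among the top_k words of every topic.

    `topics` is a list of word lists, each ordered from most to least
    probable under its topic. This definition is an assumption; the
    abstract does not specify the exact formula used.
    """
    top_words = [word for topic in topics for word in topic[:top_k]]
    if not top_words:
        raise ValueError("topics must contain at least one word")
    return len(set(top_words)) / len(top_words)


# Hypothetical toy topics, not taken from any dataset in the paper.
toy_topics = [
    ["space", "nasa", "orbit"],
    ["game", "team", "season"],
    ["space", "team", "launch"],
]

# 7 distinct words out of 9 top words in total.
score = topic_diversity(toy_topics, top_k=3)
print(f"topic diversity: {score:.3f}")
```

A score near 1 means the topics share few top words, which loosely corresponds to the uniformity goal described in the abstract; a score near 0 means the topics are largely redundant.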

Topic Modeling as Multi-Objective Contrastive Optimization

Dual Path Structural Contrastive Embeddings for Learning Novel Objects

Contrastive Learning for Neural Topic Model

Improving Topic Disentanglement Via Contrastive Learning

Enhancing Topic Interpretability for Neural Topic Modeling Through Topic-Wise Contrastive Learning

Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning

Contrastive estimation reveals topic posterior information to linear models

On the Comparison between Multi-modal and Single-modal Contrastive Learning

Graph Contrastive Topic Model

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

Multilingual and Multimodal Topic Modelling with Pretrained Embeddings

Modeling Dynamic Topics in Chain-Free Fashion by Evolution-Tracking Contrastive Learning and Unassociated Word Exclusion

$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs

Multi-Similarity Contrastive Learning

Multi-Task Learning with Multi-Task Optimization

Multimodal Contrastive Learning via Uni-Modal Coding and Cross-Modal Prediction for Multimodal Sentiment Analysis

Aligning Visual Contrastive learning models via Preference Optimization

CoTE: A Flexible Method for Joint Learning of Topic and Embedding Models

Contrastive Multimodal Fusion with TupleInfoNCE

Optimizing Non-Autoregressive Transformers with Contrastive Learning

Contrastive Data and Learning for Natural Language Processing