Abstract: Many semantic-driven methods have been proposed for guiding natural language generation. While they clearly improve performance on the end-to-end training task, these methods still have clear limitations: (i) they use only shallow semantic signals (e.g., from topic models) with a single stochastic hidden layer in their data generation process, which makes them susceptible to noise (especially on short texts) and hard to interpret; (ii) they ignore sentence order and document context, treating each document as a bag of sentences, and so fail to capture long-distance dependencies and the global semantic meaning of a document. To overcome these problems, we propose a novel semantic-driven language modeling framework that simultaneously learns a Hierarchical Language Model and a Recurrent Conceptualization-enhanced Gamma Belief Network. For scalable inference, we develop auto-encoding Variational Recurrent Inference, which allows efficient end-to-end training while capturing global semantics from a text corpus. In particular, this article introduces concept information derived from the high-quality lexical knowledge graph Probase, which gives the proposed model strong interpretability and anti-noise capability. Moreover, the proposed model captures not only intra-sentence word dependencies but also temporal transitions between sentences and inter-sentence concept dependencies. Experiments on several NLP tasks validate the superiority of the proposed approach, which effectively infers a meaningful hierarchical concept structure of documents and hierarchical multi-scale structures of sequences, even when compared with the latest state-of-the-art Transformer-based models.

Sequence Modeling with Hierarchical Deep Generative Models with Dual Memory.

Learning to Generate with Memory

Hierarchical Topic Modeling with Nested Hierarchical Dirichlet Process

Characterizing A Database of Sequential Behaviors with Latent Dirichlet Hidden Markov Models

Learning Deep Generative Models with Doubly Stochastic Gradient MCMC

Generative Text Convolutional Neural Network for Hierarchical Document Representation Learning

Generative Modeling with Explicit Memory

Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences

Unified Generative and Discriminative Training for Multi-modal Large Language Models

Extending Memory for Language Modelling

On Hierarchical Multi-Resolution Graph Generative Models

Hierarchically Gated Recurrent Neural Network for Sequence Modeling

A dual channel class hierarchy based recurrent language modeling

Deep Generative Dual Memory Network for Continual Learning

Introducing the Hidden Neural Markov Chain framework

Hierarchical Adversarially Learned Inference

Hierarchical Concept-Driven Language Model.

A Transformer-Based Hierarchical Variational AutoEncoder Combined Hidden Markov Model for Long Text Generation

Sequence Modeling with Multiresolution Convolutional Memory

HMT: Hierarchical Memory Transformer for Long Context Language Processing

Emergent communication of multimodal deep generative models based on Metropolis-Hastings naming game