Tree-Structured Neural Machine for Linguistics-Aware Sentence Generation

Ganbin Zhou, Ping Luo, Rongyu Cao, Yijun Xiao, Fen Lin, Bo Chen, Qing He

DOI: https://doi.org/10.48550/arXiv.1705.00321

2018-01-03

Abstract: Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoder ignore this structure in human language and might generate plausible responses with less satisfactory relevance and fluency. In this study, we aim to incorporate the results from linguistic analysis into the process of sentence generation for high-quality conversation generation. Specifically, we use a dependency parser to transform each response sentence into a dependency tree and construct a training corpus of sentence-tree pairs. A tree-structured decoder is developed to learn the mapping from a sentence to its tree, where different types of hidden states are used to depict the local dependencies from an internal tree node to its children. For training acceleration, we propose a tree canonicalization method, which transforms trees into equivalent ternary trees. Then, with a proposed tree-structured search method, the model is able to generate the most probable responses in the form of dependency trees, which are finally flattened into sequences as the system output. Experimental results demonstrate that the proposed X2Tree framework outperforms baseline methods over 11.15% increase of acceptance ratio.

Artificial Intelligence, Computation and Language, Machine Learning

What problem does this paper attempt to address?

The paper addresses a weakness of conversational response generation: models with chain-structured decoders ignore the syntactic structure of human language, so their responses can look plausible yet fall short on relevance and fluency. Conventional sequence-to-sequence models decode a response linearly, word by word, and make no use of the sentence's syntactic structure, in particular its dependency tree, which can leave the output grammatically or semantically inaccurate.

To address this, the paper brings the results of linguistic analysis, specifically dependency parse trees, into the sentence generation process. A dependency parser converts each response sentence into a dependency tree, yielding a training corpus of sentence-tree pairs, and a tree-structured decoder learns the mapping from a sentence to its tree. To accelerate training, the paper proposes a tree canonicalization method that converts trees whose nodes have varying numbers of children into equivalent ternary trees, which simplifies implementation on the GPU. At inference time, a tree-structured search method produces the most probable responses as dependency trees, which are then flattened into sequences as the system output.

Overall, the main contribution is a tree-structured decoder framework, X2Tree, which in the paper's experiments outperforms the baseline methods by more than 11.15% in acceptance ratio.
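The summary names two tree operations, canonicalization into a ternary tree and flattening back into a word sequence, without giving the paper's exact rules. The Python sketch below illustrates one possible encoding, not necessarily the one used in X2Tree: each node keeps a link to its first left dependent, its first right dependent, and its next sibling on the same side. The names `DepNode`, `TernaryNode`, `build_tree`, `to_ternary`, and `flatten` are hypothetical, and the sketch assumes a projective dependency tree given as a list of head indices.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DepNode:
    """Ordinary dependency-tree node; a head may have any number of children."""
    index: int  # position of the word in the sentence
    word: str
    children: List["DepNode"] = field(default_factory=list)


@dataclass
class TernaryNode:
    """Assumed ternary encoding: at most three outgoing links per node."""
    index: int
    word: str
    left: Optional["TernaryNode"] = None     # first dependent left of this word
    right: Optional["TernaryNode"] = None    # first dependent right of this word
    sibling: Optional["TernaryNode"] = None  # next dependent of the same head, same side


def build_tree(words: List[str], heads: List[int]) -> DepNode:
    """Build a tree from head indices; heads[i] == -1 marks the root."""
    nodes = [DepNode(i, w) for i, w in enumerate(words)]
    root = None
    for i, h in enumerate(heads):
        if h == -1:
            root = nodes[i]
        else:
            nodes[h].children.append(nodes[i])  # children stay in sentence order
    if root is None:
        raise ValueError("heads must contain exactly one root marked -1")
    return root


def to_ternary(node: DepNode) -> TernaryNode:
    """Turn the variable-width child list into left/right sibling chains."""
    t = TernaryNode(node.index, node.word)
    left = [to_ternary(c) for c in node.children if c.index < node.index]
    right = [to_ternary(c) for c in node.children if c.index > node.index]
    for group in (left, right):
        for current, following in zip(group, group[1:]):
            current.sibling = following
    t.left = left[0] if left else None
    t.right = right[0] if right else None
    return t


def flatten(t: Optional[TernaryNode]) -> List[str]:
    """Read the words back in sentence order (valid for projective trees)."""
    if t is None:
        return []
    return flatten(t.left) + [t.word] + flatten(t.right) + flatten(t.sibling)


# Illustrative sentence with hand-written head indices (not parser output).
words = ["she", "gave", "him", "a", "book", "yesterday"]
heads = [1, -1, 1, 4, 1, 1]
ternary = to_ternary(build_tree(words, heads))
print(" ".join(flatten(ternary)))
```

Here the head "gave" has four dependents, yet every node in the converted tree has at most three links, which is the fixed-width property the summary credits with simplifying GPU implementation. The head indices are written by hand for illustration; in practice they would come from a dependency parser.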

When Are Tree Structures Necessary for Deep Learning of Representations?

Generating More Interesting Responses in Neural Conversation Models with Distributional Constraints

In Tree Structure Should Sentence Be Generated

On Tree-Based Neural Sentence Modeling

Explicit Syntactic Guidance for Neural Text Generation

Natural Language Generation by Hierarchical Decoding with Linguistic Patterns

Tree-based Convolution for Sentence Modeling

Universal Dependencies Parsing For Colloquial Singaporean English

Top-Down Tree Structured Text Generation

From Genesis to Creole Language

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

TransSent: Towards Generation of Structured Sentences with Discourse Marker

Sentence-level heuristic tree search for long text generation

Chinese story generation of sentence format control based on multi-channel word embedding and novel data format

Syntax-guided Text Generation Via Graph Neural Network

Dependency-based Convolutional Neural Networks for Sentence Embedding

Structured and Natural Responses Co-generation for Conversational Search

TreeNet: Learning Sentence Representations with Unconstrained Tree Structure

Dependency-based N-Gram Models for General Purpose Sentence Realisation

Multilingual Syntax-aware Language Modeling through Dependency Tree Conversion