Abstract: Different from other sequential data, sentences in natural language are structured by linguistic grammars. Previous generative conversational models with chain-structured decoders ignore this structure in human language and might generate plausible responses with less satisfactory relevance and fluency. In this study, we aim to incorporate the results from linguistic analysis into the process of sentence generation for high-quality conversation generation. Specifically, we use a dependency parser to transform each response sentence into a dependency tree and construct a training corpus of sentence-tree pairs. A tree-structured decoder is developed to learn the mapping from a sentence to its tree, where different types of hidden states are used to depict the local dependencies from an internal tree node to its children. For training acceleration, we propose a tree canonicalization method, which transforms trees into equivalent ternary trees. Then, with a proposed tree-structured search method, the model is able to generate the most probable responses in the form of dependency trees, which are finally flattened into sequences as the system output. Experimental results demonstrate that the proposed X2Tree framework outperforms baseline methods, with an increase of over 11.15% in acceptance ratio.
What problem does this paper attempt to address?
The paper addresses a weakness of dialogue generation models in natural language processing: existing generative models with chain-structured decoders ignore the syntactic structure of human language, so their responses may look plausible yet fall short in relevance and fluency. Specifically, traditional sequence-to-sequence models decode a response linearly, one token after another. This does not exploit the syntactic structure of the sentence, in particular its dependency tree, and can lead to responses that are grammatically or semantically inaccurate.
To address this, the paper uses the results of linguistic analysis, specifically dependency parse trees, to guide sentence generation. It builds a tree-structured decoder that learns the mapping from sentences to their dependency trees, with the aim of generating higher-quality responses. In practice, a dependency parser converts each response sentence into a dependency tree, and the resulting sentence-tree pairs form the training corpus. To accelerate training, the paper also proposes a tree canonicalization method that converts trees with varying numbers of child nodes into equivalent ternary trees, which simplifies the implementation of the model on the GPU. Finally, with the proposed tree-structured search method, the model generates the most probable responses in the form of dependency trees, which are then flattened into sequences as the system output.
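The canonicalization and flattening steps described above can be illustrated with a minimal sketch. The abstract does not specify the exact encoding, so the scheme below is an assumption: each ternary node keeps a pointer to its first left dependent, its first right dependent, and its next sibling on the same side of the shared head. The names `DepNode`, `TernaryNode`, `canonicalize`, and `flatten` are hypothetical and not taken from the paper, and the sketch assumes a projective dependency tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DepNode:
    """A dependency-tree node with any number of children (hypothetical type)."""
    word: str
    index: int  # position of the word in the original sentence
    children: List["DepNode"] = field(default_factory=list)


@dataclass
class TernaryNode:
    """A node with at most three outgoing links (assumed encoding)."""
    word: str
    left: Optional["TernaryNode"] = None     # first dependent left of this word
    right: Optional["TernaryNode"] = None    # first dependent right of this word
    sibling: Optional["TernaryNode"] = None  # next dependent of the same head, same side


def _chain(deps: List[DepNode]) -> Optional[TernaryNode]:
    """Link dependents, already sorted by position, through sibling pointers."""
    head = None
    for dep in reversed(deps):
        node = canonicalize(dep)
        node.sibling = head
        head = node
    return head


def canonicalize(node: DepNode) -> TernaryNode:
    """Convert an arbitrary-arity dependency tree into a ternary tree."""
    lefts = sorted((c for c in node.children if c.index < node.index),
                   key=lambda c: c.index)
    rights = sorted((c for c in node.children if c.index > node.index),
                    key=lambda c: c.index)
    return TernaryNode(node.word, left=_chain(lefts), right=_chain(rights))


def flatten(node: Optional[TernaryNode]) -> List[str]:
    """Recover the word sequence: left dependents, the word, right dependents, siblings."""
    if node is None:
        return []
    return (flatten(node.left) + [node.word]
            + flatten(node.right) + flatten(node.sibling))


# Example: "she quickly ate the apple", with "ate" as the root.
apple = DepNode("apple", 4, [DepNode("the", 3)])
root = DepNode("ate", 2, [DepNode("she", 0), DepNode("quickly", 1), apple])
tree = canonicalize(root)
sentence = flatten(tree)
```

In this encoding, the two left dependents of "ate" are stored as a chain: `tree.left` is "she", and its `sibling` is "quickly". Because every node has a fixed number of slots, trees of different shapes can share one uniform node layout, which is the kind of regularity that makes batched computation easier.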
Overall, the main contribution of the paper is a new tree-structured decoder framework, X2Tree. In the reported experiments it generates higher-quality responses than the baseline methods, raising the acceptance ratio by over 11.15%.