Fluent Response Generation for Conversational Question Answering

Ashutosh Baheti, Alan Ritter, Kevin Small
DOI: https://doi.org/10.48550/arXiv.2005.10464
2020-12-17
Abstract: Question answering (QA) is an important aspect of open-domain conversational agents, garnering specific research focus in the conversational QA (ConvQA) subtask. One notable limitation of recent ConvQA efforts is the response being answer span extraction from the target corpus, thus ignoring the natural language generation (NLG) aspect of high-quality conversational agents. In this work, we propose a method for situating QA responses within a SEQ2SEQ NLG approach to generate fluent grammatical answer responses while maintaining correctness. From a technical perspective, we use data augmentation to generate training data for an end-to-end system. Specifically, we develop Syntactic Transformations (STs) to produce question-specific candidate answer responses and rank them using a BERT-based classifier (Devlin et al., 2019). Human evaluation on SQuAD 2.0 data (Rajpurkar et al., 2018) demonstrates that the proposed model outperforms baseline CoQA and QuAC models in generating conversational responses. We further show our model's scalability by conducting tests on the CoQA dataset. The code and data are available at https://github.com/abaheti95/QADialogSystem.
Computation and Language
What problem does this paper attempt to address?
The paper addresses a limitation of current Conversational Question Answering (ConvQA) systems: their responses are text spans extracted directly from the source document, with no Natural Language Generation (NLG) step. Such answers can be correct, but they are not fluent or natural enough for multi-turn dialogue. To overcome this limitation, the paper proposes a Sequence-to-Sequence (SEQ2SEQ) approach that generates conversational answers that are both accurate and fluent. It does so in three ways:
1. **Data Augmentation**: Starting from existing question-answering datasets (such as SQuAD), generate candidate responses with Syntactic Transformations (STs) and rank them with a BERT-based classifier, producing training data for the end-to-end model.
2. **Model Design**: Build generation models based on Pointer-Generator Networks (PGN) and pre-trained Transformer models (such as DialoGPT) that produce more natural and fluent answers.
3. **Evaluation and Verification**: Use human and automatic evaluation to verify that the proposed model generates fluent and accurate conversational answers, in particular on the SQuAD 2.0 and CoQA datasets.
The goal is to improve the answer quality of ConvQA systems so that responses are not only accurate but also natural and suited to multi-turn dialogue.
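The generate-then-rank idea behind the data augmentation step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `candidate_responses` uses simple string templates in place of the paper's Syntactic Transformations, and `toy_scorer` is a word-overlap heuristic standing in for the BERT-based classifier. Neither is taken from the authors' code.

```python
from typing import Callable, List


def candidate_responses(question: str, answer: str) -> List[str]:
    """Build candidate full-sentence responses for a (question, answer) pair.

    Hypothetical stand-in for Syntactic Transformations: it only handles
    questions shaped like "<wh-word> <auxiliary> <rest>?" and uses plain
    string templates rather than a syntactic parse.
    """
    words = question.rstrip("?").split()
    candidates = [answer]  # the bare answer span is always kept as a fallback
    if len(words) >= 3:
        aux, rest = words[1], " ".join(words[2:])
        candidates.append(f"{rest} {aux} {answer}")
        candidates.append(f"{answer} {aux} {rest}")
    # Capitalize the first character and close each candidate with a period.
    return [c[0].upper() + c[1:] + "." for c in candidates]


def toy_scorer(question: str, response: str) -> float:
    """Count response words that also occur in the question.

    Placeholder for a learned ranker; a real system would score each
    (question, response) pair with a trained classifier instead.
    """
    question_words = set(question.lower().rstrip("?").split())
    response_words = response.lower().rstrip(".").split()
    return float(sum(1 for w in response_words if w in question_words))


def rank(question: str, candidates: List[str],
         scorer: Callable[[str, str], float]) -> List[str]:
    """Return the candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: scorer(question, c), reverse=True)


if __name__ == "__main__":
    q, a = "What is the capital of France?", "Paris"
    for response in rank(q, candidate_responses(q, a), toy_scorer):
        print(response)
```

In this sketch the top-ranked candidate would serve as the target response for a (question, answer) training pair, while the bare span ranks last because it shares no words with the question. Swapping `toy_scorer` for a trained model only requires a callable with the same two-argument signature.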