Learning Multi-turn Response Selection in Grounded Dialogues with Reinforced Knowledge and Context Distillation

Jiazhan Feng, Chongyang Tao, Xueliang Zhao, Dongyan Zhao
DOI: https://doi.org/10.1145/3584701
IF: 4.657
2023-04-21
ACM Transactions on Information Systems
Abstract: Recently, knowledge-grounded dialogue systems have gained increasing attention. Great efforts have been made to build response matching models in which all dialogue content and knowledge sentences are leveraged. However, knowledge redundancy and distraction from irrelevant dialogue content often exist in knowledge-grounded conversations, which may affect the matching process and lead to inferior performance. In addition, irrelevant dialogue history and excessive knowledge hinder the exploitation of popular pre-trained language models (PLMs) because of their input length limit. To address these challenges, we propose a new knowledge-grounded dialogue model based on PLMs, in which a knowledge selector and a context selector are designed to filter out irrelevant knowledge sentences and redundant dialogue history, respectively. Because labeled data for training the two selectors is lacking, we pre-train them with weakly-supervised tasks and then use reinforcement learning (RL) to jointly optimize knowledge and context selection and fine-tune the PLMs for response ranking. In this way, the dialogue model can distill more accurate and concise knowledge and dialogue content for the subsequent response ranking module, and the overall model can converge and perform better. We conduct experiments on two benchmarks, and the evaluation results indicate that our model significantly outperforms state-of-the-art methods.
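The select-then-rank pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the learned knowledge and context selectors are replaced here by a toy word-overlap score, the PLM ranker by the same overlap score, and the weakly-supervised pre-training and RL optimization are omitted entirely. All function names and the parameters `k_ctx` and `k_kn` are hypothetical.

```python
def tokens(text):
    """Lowercase whitespace tokenization (toy stand-in for a PLM tokenizer)."""
    return text.lower().split()


def overlap(a, b):
    """Toy relevance score: number of shared word types.

    Assumption: stands in for a learned selector or ranker score.
    """
    return len(set(tokens(a)) & set(tokens(b)))


def select(items, query, k):
    """Keep the k items most relevant to the query, preserving original order."""
    ranked = sorted(range(len(items)),
                    key=lambda i: overlap(items[i], query),
                    reverse=True)
    keep = sorted(ranked[:k])
    return [items[i] for i in keep]


def rank_responses(context, knowledge, candidates, k_ctx=2, k_kn=1):
    """Filter context and knowledge, then rank candidates against what remains."""
    query = context[-1]  # assumption: the latest turn drives selection
    ctx = select(context, query, k_ctx)    # context selector (toy)
    kn = select(knowledge, query, k_kn)    # knowledge selector (toy)
    evidence = " ".join(ctx + kn)          # shortened input for the ranker
    return sorted(candidates, key=lambda r: overlap(r, evidence), reverse=True)


context = ["i love hiking", "what about food", "tell me about the eiffel tower"]
knowledge = ["the eiffel tower is in paris", "pizza comes from italy"]
candidates = ["pizza is great", "the eiffel tower is in paris france"]
best = rank_responses(context, knowledge, candidates)[0]
```

Filtering before ranking is what keeps the ranker input short, which is the motivation the abstract gives in connection with PLM input length limits.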
computer science, information systems