UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems

Hongru Wang, Wenyu Huang, Yang Deng, Rui Wang, Zezhong Wang, Yufei Wang, Fei Mi, Jeff Z. Pan, Kam-Fai Wong
2024-09-19
Abstract: Large Language Models (LLMs) have shown exceptional capabilities in many natural language understanding and generation tasks. However, personalization remains a much-coveted property, especially when multiple sources are involved in the dialogue system. To better plan and incorporate the use of multiple sources in generating personalized responses, we first decompose the task into three sub-tasks: Knowledge Source Selection, Knowledge Retrieval, and Response Generation. We then propose a novel Unified Multi-Source Retrieval-Augmented Generation system (UniMS-RAG). Specifically, we unify these three sub-tasks with different formulations into the same sequence-to-sequence paradigm during training, to adaptively retrieve evidence and evaluate its relevance on demand using special tokens, called acting tokens and evaluation tokens. Enabling language models to generate acting tokens facilitates interaction with various knowledge sources, allowing them to adapt their behavior to diverse task requirements. Meanwhile, evaluation tokens gauge the relevance score between the dialogue context and the retrieved evidence. In addition, we carefully design a self-refinement mechanism to iteratively refine the generated response considering 1) the consistency scores between the generated response and the retrieved evidence; and 2) the relevance scores. Experiments on two personalized datasets (DuLeMon and KBP) show that UniMS-RAG achieves state-of-the-art performance on the knowledge source selection and response generation tasks with itself as a retriever in a unified manner. Extensive analyses and discussions are provided to shed new light on personalized dialogue systems.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses personalized knowledge-enhanced response generation (PerDS) in dialogue systems, especially when multiple knowledge sources are involved. Existing knowledge-enhanced dialogue systems either focus on a single knowledge source or use all knowledge sources indiscriminately, ignoring that real scenarios can involve several sources with complex relationships between them. In addition, existing methods usually train the retriever and the reader independently, which leads to sub-optimal performance and a distribution mismatch between the two; the alternative, designing complex architectures to optimize them jointly, is impractical in the era of large language models because of the high computational cost.

To address these challenges, the authors first decompose personalized knowledge-enhanced dialogue response generation into three distinct subtasks:

1. **Knowledge Source Selection** (Planner): Plan the order in which knowledge sources are invoked according to the dialogue context, considering whether the sources are independent of or dependent on one another.
2. **Knowledge Retrieval** (Retriever): Sequentially retrieve the top n pieces of evidence from external sources according to the decision made in the previous step.
3. **Response Generation** (Reader): Generate a knowledge-enhanced natural-language response based on the original dialogue context and the retrieved evidence.

The authors then design a new framework, the Unified Multi-Source Retrieval-Augmented Generation system (UniMS-RAG), which unifies these three tasks in a sequence-to-sequence (Seq2Seq) manner using the same large language model.
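The three-stage decomposition above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: in UniMS-RAG all three roles are played by one trained language model, whereas here each role is a hypothetical stand-in function (`plan_sources`, `retrieve`, `generate`) and relevance is a toy word-overlap count.

```python
from typing import Dict, List

# Hypothetical layout: source name -> list of evidence strings.
Sources = Dict[str, List[str]]


def overlap(a: str, b: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def plan_sources(context: str, sources: Sources) -> List[str]:
    """Planner stand-in: keep sources with some overlap, best-scoring first.

    Returning an empty list means "no external knowledge needed".
    """
    scored = [
        (max((overlap(context, e) for e in evidence), default=0), name)
        for name, evidence in sources.items()
    ]
    # sorted() is stable, so ties keep the dictionary's insertion order.
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [name for score, name in ranked if score > 0]


def retrieve(query: str, evidence: List[str], n: int) -> List[str]:
    """Retriever stand-in: top-n evidence by the toy overlap score."""
    return sorted(evidence, key=lambda e: -overlap(query, e))[:n]


def generate(context: str, evidence: List[str]) -> str:
    """Reader stand-in: a template instead of a language model."""
    grounding = " | ".join(evidence) if evidence else "no external evidence"
    return f"Response to '{context}' grounded in: {grounding}"


def respond(context: str, sources: Sources, n: int = 1) -> str:
    """Run planner -> retriever -> reader in sequence."""
    query = context
    gathered: List[str] = []
    for name in plan_sources(context, sources):
        hits = retrieve(query, sources[name], n)
        gathered.extend(hits)
        # Assumption: a dependent source is queried with the dialogue
        # context plus the evidence already retrieved from earlier sources.
        query = context + " " + " ".join(hits)
    return generate(context, gathered)
```

Extending the query with earlier hits is one simple way to model a dependent source, where what is retrieved later depends on what was retrieved first; an independent source would simply reuse the original context as its query.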
By introducing two kinds of special tokens, acting tokens and evaluation tokens, UniMS-RAG can adaptively retrieve evidence and evaluate its relevance, thereby reformulating the above three subtasks as token-prediction tasks during training. In addition, the authors design a self-refinement mechanism that re-evaluates the generated response by leveraging the feedback from the evaluation tokens during the inference stage, ensuring that the generated response is consistent with the provided evidence and has a high relevance score. Experimental results show that UniMS-RAG outperforms previous strong baseline models on two personalized datasets (DuLeMon and KBP), and reaches a new state-of-the-art level when using more advanced external retrievers, generating more personalized and factually accurate responses.
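The self-refinement idea can be illustrated with a short loop that combines the two signals mentioned above: relevance scores to filter evidence, and consistency scores to decide whether a draft needs another pass. This is a hedged sketch rather than the paper's procedure: the thresholds, the `must_cover` feedback argument, and the toy `toy_generate` and `toy_consistency` functions are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

ScoredEvidence = Tuple[str, float]  # (evidence text, relevance score in [0, 1])
Generator = Callable[[str, List[str], List[str]], str]
Scorer = Callable[[str, str], float]


def self_refine(
    context: str,
    scored_evidence: List[ScoredEvidence],
    generate: Generator,
    consistency: Scorer,
    min_relevance: float = 0.5,     # hypothetical threshold
    min_consistency: float = 0.5,   # hypothetical threshold
    max_rounds: int = 3,
) -> str:
    """Regenerate until the draft is consistent with all relevant evidence."""
    # Relevance signal: discard evidence scored below the threshold.
    kept = [text for text, relevance in scored_evidence if relevance >= min_relevance]
    response = generate(context, kept, [])
    for _ in range(max_rounds):
        # Consistency signal: find evidence the current draft fails to reflect.
        violated = [t for t in kept if consistency(response, t) < min_consistency]
        if not violated:
            break
        # Feed the violated evidence back to the generator and try again.
        response = generate(context, kept, violated)
    return response


def toy_generate(context: str, evidence: List[str], must_cover: List[str]) -> str:
    """Deliberately weak reader: uses the first evidence plus any feedback items."""
    first = evidence[:1]
    used = first + [text for text in must_cover if text not in first]
    return f"Reply to '{context}' using: " + "; ".join(used)


def toy_consistency(response: str, evidence: str) -> float:
    """Toy consistency: 1.0 if the evidence text appears in the response."""
    return 1.0 if evidence.lower() in response.lower() else 0.0
```

With real models, `generate` and `consistency` would be calls to the language model and a scoring component; the loop is bounded by `max_rounds`, so it returns the latest draft even if some evidence is still not reflected.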