Abstract: Large Language Models (LLMs) have shown exceptional capabilities in many natural language understanding and generation tasks. However, personalization remains a much-coveted property, especially when multiple knowledge sources are involved in the dialogue system. To better plan and incorporate multiple sources when generating personalized responses, we first decompose the problem into three sub-tasks: Knowledge Source Selection, Knowledge Retrieval, and Response Generation. We then propose a novel Unified Multi-Source Retrieval-Augmented Generation system (UniMS-RAG). Specifically, we unify these three sub-tasks, which have different formulations, into the same sequence-to-sequence paradigm during training, so that the model can adaptively retrieve evidence and evaluate its relevance on demand using special tokens, called acting tokens and evaluation tokens. Enabling language models to generate acting tokens facilitates interaction with various knowledge sources, allowing them to adapt their behavior to diverse task requirements. Meanwhile, evaluation tokens gauge the relevance score between the dialogue context and the retrieved evidence. In addition, we carefully design a self-refinement mechanism to iteratively refine the generated response considering 1) the consistency scores between the generated response and the retrieved evidence; and 2) the relevance scores. Experiments on two personalized datasets (DuLeMon and KBP) show that UniMS-RAG achieves state-of-the-art performance on the knowledge source selection and response generation tasks, with itself as a retriever in a unified manner. Extensive analyses and discussions are provided to offer new perspectives on personalized dialogue systems.

What problem does this paper attempt to address?

This paper addresses personalized knowledge-enhanced response generation (PerDS) in dialogue systems, especially when multiple knowledge sources are involved. Existing knowledge-enhanced dialogue systems either focus on a single knowledge source or use all knowledge sources indiscriminately, ignoring the fact that real scenarios may involve multiple knowledge sources with complex relationships between them. In addition, existing methods usually train the retriever and the reader independently, which leads to sub-optimal performance and a distribution mismatch between the retriever and the reader; alternatively, they design complex architectures to optimize both simultaneously, which is impractical in the era of large language models because of the high computational cost.

To address these challenges, the authors first decompose personalized knowledge-enhanced dialogue response generation into three distinct subtasks:

1. **Knowledge Source Selection** (Planner): Plan the order of knowledge source invocation according to the dialogue context, considering the independent or dependent relationships between different sources.
2. **Knowledge Retrieval** (Retriever): Sequentially retrieve the top n pieces of evidence from external sources according to the decision in the previous step.
3. **Response Generation** (Reader): Generate a knowledge-enhanced natural-language response based on the original dialogue context and the retrieved evidence.

The authors then design a new framework, the Unified Multi-Source Retrieval-Augmented Generation system (UniMS-RAG), which unifies these three tasks in a sequence-to-sequence (Seq2Seq) manner using the same large language model.
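The Planner, Retriever, and Reader decomposition can be illustrated with a minimal Python sketch. It is not the authors' implementation: UniMS-RAG performs all three steps with one large language model, whereas here each step is a toy function based on token overlap. All names (`token_overlap`, `plan_sources`, `retrieve`, `generate`, `respond`) and the overlap heuristic are assumptions made for illustration, and the toy planner does not model dependencies between sources.

```python
from typing import Dict, List

# Hypothetical stand-ins for illustration only; in UniMS-RAG a single LLM
# plays all three roles (planner, retriever, reader).

def token_overlap(a: str, b: str) -> int:
    """Toy relevance signal: number of shared lowercase tokens."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def plan_sources(context: str, sources: Dict[str, List[str]]) -> List[str]:
    """Planner stand-in: keep sources with some overlap, best source first."""
    scored = []
    for name, docs in sources.items():
        best = max((token_overlap(context, d) for d in docs), default=0)
        if best > 0:
            scored.append((best, name))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]

def retrieve(context: str, docs: List[str], top_n: int) -> List[str]:
    """Retriever stand-in: top-n documents by token overlap."""
    ranked = sorted(docs, key=lambda d: token_overlap(context, d), reverse=True)
    return ranked[:top_n]

def generate(context: str, evidence: List[str]) -> str:
    """Reader stand-in: a template instead of a language model."""
    return f"[reply to {context!r} grounded on: {'; '.join(evidence)}]"

def respond(context: str, sources: Dict[str, List[str]], top_n: int = 1) -> str:
    """Run the three subtasks in order: plan, retrieve, then generate."""
    evidence: List[str] = []
    for name in plan_sources(context, sources):
        evidence.extend(retrieve(context, sources[name], top_n))
    return generate(context, evidence)
```

Calling `respond` with a dialogue context and a dictionary that maps source names (for example a persona source and a document source) to lists of strings returns a templated reply grounded only on the sources the toy planner selected.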
By introducing two kinds of special tokens, acting tokens and evaluation tokens, UniMS-RAG can adaptively retrieve evidence and evaluate relevance, thereby reformulating the above three subtasks as token-prediction tasks during training. In addition, the authors design a self-refinement mechanism that re-evaluates the generated response by leveraging the feedback from the evaluation tokens during the inference stage, ensuring that the generated response is consistent with the provided evidence and has a high relevance score. Experimental results show that UniMS-RAG outperforms previous strong baseline models on two personalized datasets (DuLeMon and KBP), and reaches a new state-of-the-art level when using more advanced external retrievers, generating more personalized and factually accurate responses.
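One plausible reading of the self-refinement idea is sketched below in Python. The summary only states that refinement uses consistency scores (response versus evidence) and relevance scores (context versus evidence); the specific loop, which drops the least relevant evidence and regenerates until a consistency threshold is met, is an assumption. So are the function names, the Jaccard scorers, the `threshold`, and `max_rounds`. In the paper, the scores come from the model's evaluation tokens rather than from token overlap.

```python
from typing import Callable, List, Tuple

def jaccard(a: str, b: str) -> float:
    """Toy score in [0, 1]: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def echo_top(context: str, pool: List[str]) -> str:
    """Toy generator: repeat the most relevant evidence (hypothetical)."""
    return pool[0] if pool else ""

def mean_consistency(response: str, pool: List[str]) -> float:
    """Toy consistency: average overlap between response and each evidence."""
    return sum(jaccard(response, e) for e in pool) / len(pool)

def self_refine(
    context: str,
    evidence: List[str],
    relevance: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    consistency: Callable[[str, List[str]], float],
    threshold: float = 0.5,
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Regenerate with less evidence until the response is consistent enough."""
    # Rank evidence by relevance to the dialogue context, highest first.
    pool = sorted(evidence, key=lambda e: relevance(context, e), reverse=True)
    response = generate(context, pool)
    for _ in range(max_rounds):
        # Stop when nothing can be dropped or the response is consistent.
        if len(pool) <= 1 or consistency(response, pool) >= threshold:
            break
        pool = pool[:-1]  # assumed policy: discard the least relevant item
        response = generate(context, pool)
    return response, pool
```

With the toy scorers, passing an on-topic and an off-topic piece of evidence and a high threshold causes the loop to discard the off-topic item before returning the final response and the evidence it kept.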

UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems

UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models

Multitask Learning and Reinforcement Learning for Personalized Dialog Generation: an Empirical Study

IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue

ERAGent: Enhancing Retrieval-Augmented Language Models with Improved Accuracy, Efficiency, and Personalization

UniDU: Towards A Unified Generative Dialogue Understanding Framework

Generating Personalized Dialogue via Multi-Task Meta-Learning

Towards Multi-Source Retrieval-Augmented Generation via Synergizing Reasoning and Preference-Driven Retrieval

UniRQR: A Unified Model for Retrieval Decision, Query, and Response Generation in Internet-Based Knowledge Dialogue Systems

UniMC: A Unified Framework for Long-Term Memory Conversation via Relevance Representation Learning

UniRaG: Unification, Retrieval, and Generation for Multimodal Question Answering With Pre-Trained Language Models

Retrieval-Augmented Personalization for Multimodal Large Language Models

UniGen: A Unified Generative Framework for Retrieval and Question Answering with Large Language Models

Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Towards a Unified Multi-Dimensional Evaluator for Text Generation

Retrieval-Augmented Generation for Large Language Models: A Survey

Improving Open-Domain Dialogue Response Generation with Multi-Source Multilingual Commonsense Knowledge

More is Better: Enhancing Open-Domain Dialogue Generation via Multi-Source Heterogeneous Knowledge