Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA

Nirmal Roy, Leonardo F. R. Ribeiro, Rexhina Blloshmi, Kevin Small

2024-09-24

Abstract: Augmenting Large Language Models (LLMs) with information retrieval capabilities (i.e., Retrieval-Augmented Generation (RAG)) has proven beneficial for knowledge-intensive tasks. However, understanding users' contextual search intent when generating responses is an understudied topic for conversational question answering (QA). This conversational extension leads to additional concerns when compared to single-turn QA as it is more challenging for systems to comprehend conversational context and manage retrieved passages over multiple turns. In this work, we propose a method for enabling LLMs to decide when to retrieve in RAG settings given a conversational context. When retrieval is deemed necessary, the LLM then rewrites the conversation for passage retrieval and judges the relevance of returned passages before response generation. Operationally, we build on the single-turn SELF-RAG framework (Asai et al., 2023) and propose SELF-multi-RAG for conversational settings. SELF-multi-RAG demonstrates improved capabilities over single-turn variants with respect to retrieving relevant passages (by using summarized conversational context) and assessing the quality of generated responses. Experiments on three conversational QA datasets validate the enhanced response generation capabilities of SELF-multi-RAG, with improvements of ~13% measured by human annotation.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses how to enable large language models (LLMs) in multi-turn conversational question answering (CQA) to decide, given the conversational context, whether retrieval is needed and, when it is, to rewrite the conversation for passage retrieval, judge the relevance of the returned passages, and generate a high-quality answer. Specifically, the proposed SELF-multi-RAG framework aims to:

1. **Understand the conversational context**: In a multi-turn conversation, the system must consider not only the current question but also the preceding turns in order to decide whether new information has to be retrieved or whether previously retrieved information is sufficient to generate an answer.
2. **Summarize the conversational context**: To improve retrieval effectiveness, the model is trained to summarize the conversation history into a query. This retrieves relevant passages more effectively and avoids the information loss that traditional approaches, which rewrite only the latest question, might cause.

With these capabilities, SELF-multi-RAG can decide more accurately when to retrieve, how to rewrite the query, and how to generate an answer from the retrieved passages in multi-turn settings, which improves overall answer quality. Experiments on three conversational QA datasets show improved response generation over single-turn variants, with gains of roughly 13% as measured by human annotation.
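The control flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: in SELF-multi-RAG a single fine-tuned LLM makes these decisions itself, whereas here each step is a separate, hypothetical callable (`needs_retrieval`, `summarize`, `search`, `is_relevant`, `generate`), and the `toy_*` functions and `CORPUS` are invented stand-ins so the example runs without a model or a retriever.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Turn:
    speaker: str  # "user" or "agent"
    text: str


@dataclass
class PipelineResult:
    retrieved: bool
    query: Optional[str]
    passages: List[str]
    response: str


def answer_turn(
    history: List[Turn],
    needs_retrieval: Callable[[List[Turn]], bool],
    summarize: Callable[[List[Turn]], str],
    search: Callable[[str], List[str]],
    is_relevant: Callable[[List[Turn], str], bool],
    generate: Callable[[List[Turn], List[str]], str],
) -> PipelineResult:
    """Decide whether to retrieve, rewrite, filter passages, then respond."""
    # Step 1: decide from the conversation whether retrieval is needed.
    if not needs_retrieval(history):
        return PipelineResult(False, None, [], generate(history, []))
    # Step 2: rewrite the conversation into a single retrieval query.
    query = summarize(history)
    # Step 3: keep only passages judged relevant to the conversation.
    passages = [p for p in search(query) if is_relevant(history, p)]
    # Step 4: generate the response from the conversation and kept passages.
    return PipelineResult(True, query, passages, generate(history, passages))


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = ["SELF-RAG is a single-turn framework.", "Bananas are yellow."]


def toy_needs_retrieval(history: List[Turn]) -> bool:
    # Toy rule: retrieve only when the latest turn is a question.
    return history[-1].text.strip().endswith("?")


def toy_summarize(history: List[Turn]) -> str:
    # Toy rewrite: concatenate all user turns into one query string.
    return " ".join(t.text for t in history if t.speaker == "user")


def toy_search(query: str) -> List[str]:
    # Toy retriever: ignores the query and returns the whole corpus.
    return list(CORPUS)


def toy_is_relevant(history: List[Turn], passage: str) -> bool:
    # Toy judge: any longer word of the latest turn appears in the passage.
    words = history[-1].text.lower().replace("?", "").split()
    return any(w in passage.lower() for w in words if len(w) > 3)


def toy_generate(history: List[Turn], passages: List[str]) -> str:
    # Toy generator: echo the first kept passage, else a fixed reply.
    return passages[0] if passages else "Happy to help."


def run(history: List[Turn]) -> PipelineResult:
    return answer_turn(history, toy_needs_retrieval, toy_summarize,
                       toy_search, toy_is_relevant, toy_generate)
```

Passing the steps in as callables keeps the control flow visible and lets each decision be replaced independently, for example by a model call. It does not reflect how the paper implements these decisions inside one model.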

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

RAG based Question-Answering for Contextual Response Prediction System

Boosting Conversational Question Answering with Fine-Grained Retrieval-Augmentation and Self-Check

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains

Toward Optimal Search and Retrieval for RAG

Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models

Improving Retrieval for RAG based Question Answering Models on Financial Documents

Evaluating the Retrieval Component in LLM-Based Question Answering Systems

Enhancing Retrieval and Managing Retrieval: A Four-Module Synergy for Improved Quality and Efficiency in RAG Systems

Meta Knowledge for Retrieval Augmented Large Language Models

ActiveRAG: Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents

Adaptive Retrieval-Augmented Generation for Conversational Systems

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

Retrieval-Augmented Generation for Large Language Models: A Survey

Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models

RAGGED: Towards Informed Design of Retrieval Augmented Generation Systems

Mindful-RAG: A Study of Points of Failure in Retrieval Augmented Generation