Context-augmented Retrieval: A Novel Framework for Fast Information Retrieval based Response Generation using Large Language Model

Sai Ganesh, Anupam Purwar, Gautam B

2024-06-24

Abstract: Consistently generating high-quality answers by embedding contextual information in the prompt passed to a Large Language Model (LLM) depends on the quality of information retrieval. As the corpus of contextual information grows, the answer/inference quality of Retrieval Augmented Generation (RAG) based Question Answering (QA) systems declines. This work addresses this problem by combining classical text classification with the LLM to enable quick information retrieval from the vector store and ensure the relevance of the retrieved information. To this end, it proposes a new approach, Context Augmented Retrieval (CAR), which partitions the vector database by classifying information in real time as it flows into the corpus. CAR demonstrates good-quality answer generation along with a significant reduction in information retrieval and answer generation time.

Information Retrieval

What problem does this paper attempt to address?

The paper addresses the decline in retrieval and answer quality that Retrieval-Augmented Generation (RAG) question-answering systems suffer as their knowledge base grows. With a larger knowledge base, RAG systems retrieve relevant documents less efficiently and generate lower-quality answers, and both retrieval and response times increase. To solve this problem, the paper proposes a framework called Context Augmented Retrieval (CAR), which combines traditional text classification with large language models (LLMs) to retrieve information quickly while keeping the retrieved information relevant. The approach has four steps:

1. **Query Classification**: A classification model classifies each user query in real time, assigning it to a relevant domain or category.
2. **Index Loading**: Based on the classification result, the index for that specific domain is loaded so that context relevant to the user query can be retrieved.
3. **Hybrid Retriever**: A BM25 retriever and a vector retriever are combined to retrieve relevant information from the loaded index efficiently.
4. **Query Engine**: The retrieved context is passed along with the user query to the LLM, which generates a coherent, information-rich answer.

Through these steps, CAR significantly reduces the time for information retrieval and answer generation while maintaining answer quality.
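The four steps above can be illustrated with a minimal, self-contained Python sketch. It is not the authors' implementation, and every name in it (`PARTITIONS`, `DOMAIN_KEYWORDS`, `classify`, `hybrid_retrieve`, `build_prompt`) is hypothetical. It also makes several simplifying assumptions: a keyword-overlap rule stands in for the classification model, bag-of-words cosine similarity stands in for an embedding-based vector retriever, the two scores are fused with a simple weighted sum (the summary does not say how CAR combines them), and the final LLM call is omitted, so the sketch stops at the assembled prompt.

```python
import math
from collections import Counter

# Hypothetical toy corpus, already partitioned by domain at ingestion time.
PARTITIONS = {
    "finance": [
        "Quarterly revenue grew because subscription sales increased.",
        "The audit found no errors in the annual expense report.",
    ],
    "health": [
        "Regular exercise lowers blood pressure and improves sleep.",
        "The clinic offers flu vaccines every autumn.",
    ],
}

# Assumption: keyword sets stand in for a trained text classifier.
DOMAIN_KEYWORDS = {
    "finance": {"revenue", "audit", "expense", "sales"},
    "health": {"exercise", "blood", "vaccines", "clinic", "sleep"},
}


def tokenize(text):
    """Lowercase, whitespace-split, and strip basic punctuation."""
    return [w.strip(".,?!").lower() for w in text.split()]


def classify(query):
    """Step 1: pick the domain whose keywords overlap the query most."""
    tokens = set(tokenize(query))
    return max(DOMAIN_KEYWORDS, key=lambda d: len(tokens & DOMAIN_KEYWORDS[d]))


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for t in tokenized if term in t)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def cosine_scores(query, docs):
    """Assumption: bag-of-words cosine stands in for embedding similarity."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(c * c for c in q.values()))
    scores = []
    for doc in docs:
        v = Counter(tokenize(doc))
        dot = sum(q[t] * v[t] for t in q)
        norm = q_norm * math.sqrt(sum(c * c for c in v.values()))
        scores.append(dot / norm if norm else 0.0)
    return scores


def hybrid_retrieve(query, docs, top_k=1, alpha=0.5):
    """Step 3: weighted sum of max-normalised BM25 and cosine scores."""
    bm25 = bm25_scores(query, docs)
    top = max(bm25) or 1.0  # avoid dividing by zero when nothing matches
    cos = cosine_scores(query, docs)
    combined = [alpha * s / top + (1 - alpha) * c for s, c in zip(bm25, cos)]
    ranked = sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
    return [docs[i] for i in ranked[:top_k]]


def build_prompt(query):
    """Steps 1-4, stopping just before the LLM call."""
    domain = classify(query)                # step 1: query classification
    docs = PARTITIONS[domain]               # step 2: load only that domain
    context = hybrid_retrieve(query, docs)  # step 3: hybrid retrieval
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
    return domain, prompt                   # step 4 would send prompt to an LLM


if __name__ == "__main__":
    domain, prompt = build_prompt("Why did revenue grow?")
    print(domain)
    print(prompt)
```

Because retrieval only scores the documents in the selected partition, the search space shrinks with the number of domains, which is the intuition behind the retrieval-time reduction described above.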

RAG based Question-Answering for Contextual Response Prediction System

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection

Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey

Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA

Context Awareness Gate For Retrieval Augmented Generation

A Multi-Source Retrieval Question Answering Framework Based on RAG

Context Tuning for Retrieval Augmented Generation

In Defense of RAG in the Era of Long-Context Language Models

SMART-RAG: Selection using Determinantal Matrices for Augmented Retrieval

DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering

RAGGED: Towards Informed Design of Retrieval Augmented Generation Systems

Retrieval-Augmented Generation for Large Language Models: A Survey

PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter

LightRAG: Simple and Fast Retrieval-Augmented Generation

ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question Answering

Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Boosting Conversational Question Answering with Fine-Grained Retrieval-Augmentation and Self-Check

Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation