SFR-RAG: Towards Contextually Faithful LLMs

Xuan-Phi Nguyen, Shrey Pandit, Senthil Purushwalkam, Austin Xu, Hailin Chen, Yifei Ming, Zixuan Ke, Silvio Savarese, Caiming Xiong, Shafiq Joty

2024-09-16

Abstract: Retrieval Augmented Generation (RAG), a paradigm that integrates external contextual information with large language models (LLMs) to enhance factual accuracy and relevance, has emerged as a pivotal area in generative AI. The LLMs used in RAG applications are required to faithfully and completely comprehend the provided context and users' questions, avoid hallucination, handle unanswerable, counterfactual or otherwise low-quality and irrelevant contexts, perform complex multi-hop reasoning and produce reliable citations. In this paper, we introduce SFR-RAG, a small LLM that is instruction-tuned with an emphasis on context-grounded generation and hallucination minimization. We also present ContextualBench, a new evaluation framework compiling multiple popular and diverse RAG benchmarks, such as HotpotQA and TriviaQA, with consistent RAG settings to ensure reproducibility and consistency in model assessments. Experimental results demonstrate that our SFR-RAG-9B model outperforms leading baselines such as Command-R+ (104B) and GPT-4o, achieving state-of-the-art results in 3 out of 7 benchmarks in ContextualBench with significantly fewer parameters. The model is also shown to be resilient to alteration in the contextual information and behave appropriately when relevant context is removed. Additionally, the SFR-RAG model maintains competitive performance in general instruction-following tasks and function-calling capabilities.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses how, within the Retrieval-Augmented Generation (RAG) framework, large language models (LLMs) can understand and use external context more reliably, improving the factual accuracy, relevance, and reliability of generated answers. Specifically, it focuses on the following aspects:

1. **Reducing hallucinations**: preventing LLMs from generating inaccurate or factually inconsistent content when the context does not support it.
2. **Handling low-quality or irrelevant context**: when the provided context is of low quality or irrelevant to the question, the model should recognize this and respond appropriately.
3. **Multi-hop reasoning ability**: the model needs to perform complex logical reasoning across multiple context fragments to generate accurate answers.
4. **Citation ability**: the model should reliably cite sources in the context, increasing the credibility of its answers.

To address these problems, the paper introduces SFR-RAG, a small LLM instruction-tuned to strengthen context understanding in RAG applications and to reduce hallucination. It also proposes ContextualBench, a new evaluation framework that assesses RAG models under a standardized setting so that results are reproducible and consistent.
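As a rough illustration of what context-grounded prompting and checking might look like, here is a minimal Python sketch. It is not SFR-RAG's actual prompt template or ContextualBench's scoring code: the prompt wording, the `[n]` citation format, and the helper names (`build_prompt`, `cited_ids`, `citations_valid`, `exact_match`) are all assumptions made for this example.

```python
import re


def build_prompt(question, passages):
    """Number each passage so an answer can cite it as [1], [2], ...

    The instruction text is hypothetical, not taken from the paper.
    """
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the passages below and cite them as [n]. "
        "If they do not contain the answer, reply 'unanswerable'.\n\n"
        f"{numbered}\n\nQuestion: {question}\nAnswer:"
    )


def cited_ids(answer):
    """Return the sorted, de-duplicated passage numbers cited in an answer."""
    return sorted({int(m) for m in re.findall(r"\[(\d+)\]", answer)})


def citations_valid(answer, passages):
    """True if the answer cites at least one passage and every cited number exists."""
    ids = cited_ids(answer)
    return bool(ids) and all(1 <= i <= len(passages) for i in ids)


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def exact_match(prediction, gold):
    """Simple exact-match comparison after normalization (an assumed metric)."""
    return normalize(prediction) == normalize(gold)
```

Keeping the prompt builder and the scoring function fixed across datasets is one simple way to make comparisons between models repeatable; the actual settings used in the paper may differ.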

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Sufficient Context: A New Lens on Retrieval Augmented Generation Systems

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection

IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation

SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains

Lighter And Better: Towards Flexible Context Adaptation For Retrieval Augmented Generation

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation

In Defense of RAG in the Era of Long-Context Language Models

Retrieval Augmented Generation or Long-Context LLMs? A Comprehensive Study and Hybrid Approach

Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation

RAGGED: Towards Informed Design of Retrieval Augmented Generation Systems

Corrective Retrieval Augmented Generation

Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation

FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering