Toward Optimal Search and Retrieval for RAG

Alexandria Leto, Cecilia Aguerrebere, Ishwar Bhati, Ted Willke, Mariano Tepper, Vy Ai Vo
2024-11-12
Abstract: Retrieval-augmented generation (RAG) is a promising method for addressing some of the memory-related challenges associated with Large Language Models (LLMs). Two separate systems form the RAG pipeline, the retriever and the reader, and the impact of each on downstream task performance is not well-understood. Here, we work towards the goal of understanding how retrievers can be optimized for RAG pipelines for common tasks such as Question Answering (QA). We conduct experiments focused on the relationship between retrieval and RAG performance on QA and attributed QA and unveil a number of insights useful to practitioners developing high-performance RAG pipelines. For example, lowering search accuracy has minor implications for RAG performance while potentially increasing retrieval speed and memory efficiency.
Computation and Language
What problem does this paper attempt to address?
This paper asks how the retriever in a Retrieval-Augmented Generation (RAG) system can be optimized to improve performance on downstream tasks such as question answering (QA). Specifically, the authors study the role of the retriever in the RAG pipeline and how retriever performance affects final QA performance. Their experiments cover four aspects:

1. **The influence of the number of retrieved documents**: The authors vary how many retrieved documents are passed to the reader and measure QA performance. Increasing the number of retrieved documents improves performance, but the gains saturate beyond a certain number.
2. **The influence of the recall of gold documents**: The authors analyze where gold documents (i.e., documents containing the correct answer) appear in the retrieval results and how this affects task performance. Even a small number of gold documents can significantly improve performance.
3. **The influence of approximate nearest neighbor (ANN) search accuracy**: The authors lower the accuracy of ANN search and measure the effect on task performance. Reducing search accuracy has little effect on task performance, while it can improve retrieval speed and memory efficiency.
4. **The influence of noisy documents**: The authors add noisy documents with different degrees of relevance to the retrieval results. Adding noisy documents degrades performance regardless of how relevant they are.

Through these studies, the authors aim to provide guidance for designing high-performance RAG pipelines, especially for selecting and optimizing the retriever.
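To make the two retrieval quantities mentioned above concrete, here is a minimal Python sketch of how they might be measured. It is not the authors' evaluation code: the function names `gold_recall_at_k` and `search_recall_at_k`, the document IDs, and the toy rankings are all hypothetical. Gold-document recall compares a ranked list against the documents known to contain the answer, while ANN search accuracy is taken here to mean the overlap between an approximate top-k list and the exact top-k list.

```python
from typing import Sequence, Set


def gold_recall_at_k(retrieved_ids: Sequence[str], gold_ids: Set[str], k: int) -> float:
    """Fraction of gold documents that appear among the top-k retrieved documents."""
    if not gold_ids:
        raise ValueError("gold_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & gold_ids) / len(gold_ids)


def search_recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str], k: int) -> float:
    """Fraction of the exact top-k neighbors that the approximate search also returned."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        raise ValueError("exact_ids must contain at least one document and k must be positive")
    return len(set(approx_ids[:k]) & exact_top) / len(exact_top)


# Hypothetical toy data: one query, an exact ranking, and a less accurate ANN ranking.
exact = ["d1", "d2", "d3", "d4", "d5"]
approx = ["d1", "d3", "d2", "d7", "d9"]
gold = {"d2", "d8"}  # documents assumed to contain the correct answer

print(search_recall_at_k(approx, exact, k=5))  # 0.6: the ANN list misses d4 and d5
print(gold_recall_at_k(exact, gold, k=5))      # 0.5
print(gold_recall_at_k(approx, gold, k=5))     # 0.5
```

In this toy case the approximate search recovers only 60% of the exact neighbors, yet it retrieves the same gold document as the exact search. This only illustrates how the two measures can diverge; it is not a result from the paper.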