Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

Hamed Zamani, Michael Bendersky
2024-05-05
Abstract: This paper introduces Stochastic RAG--a novel approach for end-to-end optimization of retrieval-augmented generation (RAG) models that relaxes the simplifying assumptions of marginalization and document independence, made in most prior work. Stochastic RAG casts the retrieval process in RAG as a stochastic sampling without replacement process. Through this formulation, we employ straight-through Gumbel-top-k that provides a differentiable approximation for sampling without replacement and enables effective end-to-end optimization for RAG. We conduct extensive experiments on seven diverse datasets on a wide range of tasks, from open-domain question answering to fact verification to slot-filling for relation extraction and to dialogue systems. By applying this optimization method to a recent and effective RAG model, we advance state-of-the-art results on six out of seven datasets.
Computation and Language, Information Retrieval, Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is the difficulty of optimizing Retrieval-Augmented Generation (RAG) models end-to-end. Existing RAG models usually rely on simplifying assumptions during optimization, such as marginalization (through a top-k approximation) and document independence. These assumptions make optimization tractable, but they limit how much the model can improve. The paper proposes Stochastic RAG, which relaxes these assumptions within an expected utility maximization framework and thereby enables more effective end-to-end optimization.

### Main Contributions:
1. **Relaxing Simplifying Assumptions**: Stochastic RAG avoids the marginalization and document independence assumptions of earlier methods by modeling retrieval as a stochastic process of sampling without replacement.
2. **Differentiable Sampling Method**: The straight-through Gumbel-top-k method provides a differentiable approximation of sampling without replacement, so the RAG model can be optimized end-to-end.
3. **Extensive Experimental Verification**: Experiments were carried out on seven datasets covering tasks from open-domain question answering to fact verification, slot-filling for relation extraction, and dialogue systems. Stochastic RAG achieves state-of-the-art results on six of the seven datasets.

### Technical Details:
- **Expected Utility Maximization**: A utility function evaluates the quality of the generated text, and training maximizes its expected value. The utility can be any evaluation metric suited to the downstream generation task, such as exact match, BLEU, or ROUGE.
- **Probability Model**: The probability of generating output \( \hat{y} \) given input \( x \) is represented by the probability model \( p(\hat{y} | x; G_\theta, R_\phi) \), where \( G_\theta \) is the generation model and \( R_\phi \) is the retrieval model.
- **Sampling without Replacement**: Gumbel noise and softmax operations give a differentiable approximation of sampling without replacement, which sidesteps the non-differentiability of discrete sampling.

### Experimental Results:
- **Performance Improvement**: Across the seven datasets, Stochastic RAG achieves state-of-the-art results on six and is only slightly behind the GripRank method on the Wizard of Wikipedia dataset.
- **Applicability to Different Model Sizes**: Both the smaller T5-Base model and the larger T5-XL model benefit from Stochastic RAG, and the larger model performs better.

### Future Work:
- **Long-Text Generation**: Study the application of Stochastic RAG to long-text generation tasks.
- **Diversity Enhancement**: Explore how the stochastic nature of Stochastic RAG can increase the diversity of generated outputs, especially in scenarios where human feedback needs to be collected.

In conclusion, the paper addresses key obstacles to end-to-end optimization of RAG models through the Stochastic RAG method, offering a new route to better generation quality and model performance.
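
The Gumbel-top-k sampling mentioned under Technical Details can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names, the example scores, and the temperature `tau` are all assumptions made for illustration. The sketch perturbs each retrieval score with Gumbel(0, 1) noise, takes the top k perturbed scores as a sample without replacement, and computes one possible softmax relaxation with one probability row per selection step.

```python
import math
import random


def gumbel_perturb(scores, rng):
    """Add independent Gumbel(0, 1) noise to each retrieval score."""
    perturbed = []
    for s in scores:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        perturbed.append(s - math.log(-math.log(u)))
    return perturbed


def top_k_indices(values, k):
    """Indices of the k largest values, in decreasing order of value."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def relaxed_selection(perturbed, order, tau=1.0):
    """Soft counterpart of the hard sample (an illustrative relaxation):
    one softmax row per selection step, with earlier picks masked out."""
    rows, chosen = [], set()
    for idx in order:
        # Subtract the max over remaining items for numerical stability.
        m = max(p for i, p in enumerate(perturbed) if i not in chosen)
        weights = [0.0 if i in chosen else math.exp((p - m) / tau)
                   for i, p in enumerate(perturbed)]
        total = sum(weights)
        rows.append([w / total for w in weights])
        chosen.add(idx)
    return rows


rng = random.Random(0)
scores = [2.0, 1.0, 0.5, 0.0, -1.0]      # hypothetical retrieval scores
perturbed = gumbel_perturb(scores, rng)
sample = top_k_indices(perturbed, k=3)    # hard sample without replacement
soft = relaxed_selection(perturbed, sample, tau=0.5)
```

In an automatic-differentiation framework, a straight-through variant would typically use the hard selection in the forward pass and route gradients through the soft rows. That wiring depends on the framework and is omitted here.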