Re3val: Reinforced and Reranked Generative Retrieval

EuiYul Song,Sangryul Kim,Haeju Lee,Joonkee Kim,James Thorne

2024-02-23

Abstract: Generative retrieval models encode pointers to information in a corpus as an index within the model's parameters. These models serve as part of a larger pipeline, where retrieved information conditions generation for knowledge-intensive NLP tasks. However, we identify two limitations. First, generative retrieval does not account for contextual information. Second, the retrieval cannot be tuned for the downstream readers, because decoding the page title is a non-differentiable operation. This paper introduces Re3val, trained with generative reranking and reinforcement learning using limited data. Re3val leverages context acquired via Dense Passage Retrieval to rerank the retrieved page titles and utilizes REINFORCE to maximize rewards generated by constrained decoding. Additionally, we generate questions from our pre-training dataset to mitigate epistemic uncertainty and bridge the domain gap between the pre-training and fine-tuning datasets. Subsequently, we extract and rerank contexts from the KILT database using the reranked page titles. Upon grounding the top five reranked contexts, Re3val demonstrates the top-1 KILT scores compared to all other generative retrieval models across five KILT datasets.

Information Retrieval

What problem does this paper attempt to address?

This paper addresses two key limitations of generative retrieval models:

1. **Insufficient contextual information**: Existing generative retrieval models do not take contextual information into account during retrieval, so the retrieved page titles may be only weakly relevant to the query.
2. **No tuning for downstream readers**: Decoding a page title is a non-differentiable operation, so existing generative retrieval models cannot be optimized end to end for the downstream reader, which limits overall performance.

To address these limitations, the paper proposes **Re3val**, which improves generative retrieval in three ways:

- **Generative reranking**: Context obtained via Dense Passage Retrieval (DPR) is used to rerank the generated page titles and improve their relevance.
- **Reinforcement learning**: The REINFORCE algorithm is used to maximize rewards generated by constrained decoding, which gives the retriever a training signal despite the non-differentiable decoding step.
- **Question generation**: Questions are generated from the pre-training dataset to mitigate epistemic uncertainty and bridge the domain gap between the pre-training and fine-tuning datasets.

With these changes, Re3val outperforms other generative retrieval models on the KILT benchmark. Specifically, it achieves an average R-Precision improvement of 1.9% across five KILT datasets, and it also shows significant gains in zero-shot and few-shot retrieval.
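To make the reinforcement learning point concrete, the following is a minimal sketch of a REINFORCE update, not the paper's implementation. It assumes a toy categorical policy over a fixed list of candidate page titles and a hypothetical binary reward (1 if the sampled title matches a gold title, 0 otherwise); the names `reinforce_step`, `train`, and the example titles are illustrative only. The point it shows is that the sampling step needs no gradient, because the update only uses the gradient of the log-probability of the sampled title, scaled by the reward.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, lr=0.5, baseline=0.0, rng=random):
    """One REINFORCE update for a categorical policy over candidate titles."""
    probs = softmax(logits)
    # Sampling a title index is the non-differentiable step.
    action = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    advantage = reward_fn(action) - baseline
    # d/d logit_j of log pi(action) is (1[j == action] - probs[j]).
    return [
        logit + lr * advantage * ((1.0 if j == action else 0.0) - probs[j])
        for j, logit in enumerate(logits)
    ]


def train(titles, gold_title, steps=200, seed=0):
    """Run repeated updates with a hypothetical exact-match reward."""
    rng = random.Random(seed)
    logits = [0.0] * len(titles)

    def reward_fn(index):
        # Assumed reward: 1 for the gold title, 0 otherwise.
        return 1.0 if titles[index] == gold_title else 0.0

    for _ in range(steps):
        logits = reinforce_step(logits, reward_fn, rng=rng)
    return softmax(logits)


titles = ["Paris", "Paris, Texas", "France"]
probs = train(titles, gold_title="Paris")
print(dict(zip(titles, probs)))
```

In this toy setting the policy shifts probability mass toward the rewarded title. In a real generative retriever the policy would be a sequence model decoding titles token by token under constraints, and the reward definition would follow the paper rather than the exact-match assumption used here.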
