ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation

Yuelyu Ji, Zhuochun Li, Rui Meng, Daqing He
2024-10-08
Abstract: Reranking documents based on their relevance to a given query is critical in information retrieval. Traditional reranking methods often focus on improving the initial rankings but lack transparency, failing to explain why one document is ranked higher. In this paper, we introduce ReasoningRank, a novel reranking approach that enhances clarity by generating two types of reasoning: explicit reasoning, which explains how a document addresses the query, and comparison reasoning, which justifies the relevance of one document over another. We leverage large language models (LLMs) as teacher models to generate these explanations and distill this knowledge into smaller, more resource-efficient student models. While the student models may not outperform LLMs in speed, they significantly reduce the computational burden by requiring fewer resources, making them more suitable for large-scale or resource-constrained settings. These student models are trained to both generate meaningful reasoning and rerank documents, achieving competitive performance across multiple datasets, including MSMARCO and BRIGHT. Experiments demonstrate that ReasoningRank improves reranking accuracy and provides valuable insights into the decision-making process, offering a structured and interpretable solution for reranking tasks.
Computation and Language
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper addresses document re-ranking in information retrieval, specifically how to make re-ranking more transparent and interpretable. Traditional re-ranking methods can improve initial rankings, but they often lack transparency and fail to explain why one document is ranked above another.

The paper proposes **ReasoningRank**, a re-ranking method that improves transparency by generating two types of reasoning:

1. **Explicit Reasoning**: Explains how a document directly responds to a query, focusing on the relevance and specificity of the content.
2. **Comparison Reasoning**: Assesses the relative relevance of documents, explaining why one document should be ranked above another.

The authors use large language models (LLMs) as teacher models to generate these explanations and transfer this knowledge to smaller, more efficient student models through knowledge distillation. Although the student models may not outperform LLMs in speed, they significantly reduce the demand for computational resources, making them more suitable for large-scale or resource-constrained environments.

Experimental results show that ReasoningRank improves re-ranking accuracy and also provides valuable insights into the decision-making process, offering a structured and interpretable solution for re-ranking tasks.
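
To make the summary's two reasoning types and the teacher-to-student setup more concrete, here is a minimal Python sketch of how distillation training examples might be assembled. It is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the `DistillExample` record, the `build_examples` helper, and the `stub_teacher` function (a stand-in for a real LLM call) are all hypothetical, and the paper's actual prompts, pairing strategy, and training objective may differ.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List


@dataclass
class DistillExample:
    """One training record for the student (hypothetical schema)."""
    kind: str    # "explicit" or "comparison"
    prompt: str  # input shown to the student
    target: str  # teacher-written reasoning the student learns to reproduce


def explicit_prompt(query: str, doc: str) -> str:
    # Assumed wording: asks how a single document addresses the query.
    return (f"Query: {query}\nDocument: {doc}\n"
            "Explain how this document addresses the query.")


def comparison_prompt(query: str, doc_a: str, doc_b: str) -> str:
    # Assumed wording: asks which of two documents is more relevant and why.
    return (f"Query: {query}\nDocument A: {doc_a}\nDocument B: {doc_b}\n"
            "Explain which document is more relevant and why.")


def build_examples(query: str, docs: List[str],
                   teacher: Callable[[str], str]) -> List[DistillExample]:
    """Collect teacher reasoning for every document and every document pair."""
    examples = []
    for doc in docs:
        prompt = explicit_prompt(query, doc)
        examples.append(DistillExample("explicit", prompt, teacher(prompt)))
    # Assumption: all unordered pairs are compared; a real system might sample.
    for doc_a, doc_b in combinations(docs, 2):
        prompt = comparison_prompt(query, doc_a, doc_b)
        examples.append(DistillExample("comparison", prompt, teacher(prompt)))
    return examples


def stub_teacher(prompt: str) -> str:
    # Placeholder for an LLM call; returns a dummy string so the sketch runs.
    return f"[teacher reasoning for a prompt of {len(prompt)} characters]"


if __name__ == "__main__":
    data = build_examples(
        "what is knowledge distillation",
        ["doc about distillation", "doc about whisky", "doc about LLMs"],
        stub_teacher,
    )
    for example in data:
        print(example.kind, "|", example.target)
```

In a sketch like this, the student would then be fine-tuned on the `prompt` to `target` pairs so that it learns to produce the reasoning itself. Comparing every pair grows quadratically with the number of candidate documents, so a practical pipeline would likely restrict comparisons to a subset of pairs.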