A Distributed Collaborative Retrieval Framework Excelling in All Queries and Corpora based on Zero-shot Rank-Oriented Automatic Evaluation
Tian-Yi Che, Xian-Ling Mao, Chun Xu, Cheng-Xin Xin, Heng-Da Xu, Jin-Yu Liu, Heyan Huang
2024-12-16
Abstract: Numerous retrieval models, including sparse, dense and LLM-based methods, have demonstrated remarkable performance in predicting the relevance between queries and corpora. However, preliminary effectiveness analysis experiments indicate that these models fail to achieve satisfactory performance on the majority of queries and corpora, revealing that their effectiveness is restricted to specific scenarios. To tackle this problem, we propose a novel Distributed Collaborative Retrieval Framework (DCRF), outperforming each single model across all queries and corpora. Specifically, the framework integrates various retrieval models into a unified system and dynamically selects the optimal results for each user's query. It can easily aggregate any retrieval model and expand to any application scenario, illustrating its flexibility and scalability. Moreover, to reduce maintenance and training costs, we design four effective prompting strategies with large language models (LLMs) to evaluate the quality of ranks without relying on labeled data. Extensive experiments demonstrate that the proposed framework, combined with 8 efficient retrieval models, can achieve performance comparable to effective listwise methods like RankGPT and ListT5, while offering superior efficiency. Besides, DCRF surpasses all selected retrieval models on most datasets, indicating the effectiveness of our prompting strategies on rank-oriented automatic evaluation.
Information Retrieval
What problem does this paper attempt to address?
This paper addresses the problem that no single existing retrieval model adapts to every scenario an information retrieval (IR) system encounters across varied queries and corpora. Although existing sparse, dense, and large language model (LLM)-based retrieval methods perform well at predicting the relevance between queries and corpora, the authors' preliminary effectiveness analysis shows that each model fails to achieve satisfactory performance on most queries and corpora, so its effectiveness is limited to specific scenarios.
To this end, the authors propose a novel Distributed Collaborative Retrieval Framework (DCRF), which integrates multiple retrieval models into a unified system and dynamically selects the best results for each user query. Specifically, DCRF approaches the problem in the following ways:
1. **Integrating multiple retrieval models**: Integrate different retrieval models (including sparse retrieval, dense retrieval, and LLM-based retrieval) into a unified system to ensure the flexibility and extensibility of the framework.
2. **Dynamically selecting the optimal result**: Dynamically select the optimal retrieval result according to the user's query to adapt to all queries and corpora.
3. **Zero-shot automatic evaluation**: Design four effective prompting strategies that use large language models (LLMs) to evaluate ranking quality without relying on labeled data, thereby reducing maintenance and training costs.
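In this way, DCRF is intended to outperform any single model across queries and corpora and to provide a more efficient, general-purpose information retrieval solution. In the reported experiments, it surpasses all selected retrieval models on most datasets.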
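The select-the-best-ranking idea described in the three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the corpus, both retrievers, and the names `select_best_ranking` and `toy_evaluate` are hypothetical, and the evaluator is a simple term-overlap heuristic standing in for the LLM prompting strategies, whose details are not given in this summary.

```python
from typing import Callable, Dict, List, Tuple

Ranking = List[str]                           # document ids, best first
Retriever = Callable[[str], Ranking]          # query -> ranked document ids
Evaluator = Callable[[str, Ranking], float]   # (query, ranking) -> quality score

# Toy corpus, invented purely for this illustration.
CORPUS: Dict[str, str] = {
    "d1": "sparse retrieval with bm25 term matching",
    "d2": "dense retrieval with neural embeddings",
    "d3": "cooking pasta at home",
}


def _overlap(query: str, doc_id: str) -> int:
    """Number of query terms that also occur in the document."""
    return len(set(query.lower().split()) & set(CORPUS[doc_id].split()))


def overlap_retriever(query: str) -> Ranking:
    """Hypothetical retriever: rank documents by term overlap with the query."""
    return sorted(CORPUS, key=lambda doc_id: -_overlap(query, doc_id))


def reverse_id_retriever(query: str) -> Ranking:
    """Hypothetical weak retriever: ignores the query entirely."""
    return sorted(CORPUS, reverse=True)


def toy_evaluate(query: str, ranking: Ranking) -> float:
    """Label-free placeholder for the LLM-based rank evaluation (assumption):
    term overlap discounted by rank position."""
    return sum(_overlap(query, doc_id) / (pos + 1)
               for pos, doc_id in enumerate(ranking))


def select_best_ranking(
    query: str,
    retrievers: Dict[str, Retriever],
    evaluate: Evaluator,
) -> Tuple[str, Ranking, float]:
    """Run every retriever, score each ranked list, and keep the best one."""
    if not retrievers:
        raise ValueError("at least one retriever is required")
    candidates = []
    for name, retrieve in retrievers.items():
        ranking = retrieve(query)
        candidates.append((name, ranking, evaluate(query, ranking)))
    return max(candidates, key=lambda c: c[2])


name, ranking, score = select_best_ranking(
    "dense retrieval embeddings",
    {"overlap": overlap_retriever, "reverse_id": reverse_id_retriever},
    toy_evaluate,
)
print(name, ranking, score)
```

Because the selector depends only on the `Retriever` and `Evaluator` call signatures, adding another retrieval model or swapping in a different evaluator requires no change to `select_best_ranking`, which mirrors the flexibility the summary attributes to the framework.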