A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models

Shengyao Zhuang, Honglei Zhuang, Bevan Koopman, Guido Zuccon
DOI: https://doi.org/10.1145/3626772.3657813
2024-05-30
Abstract: We propose a novel zero-shot document ranking approach based on Large Language Models (LLMs): the Setwise prompting approach. Our approach complements existing prompting approaches for LLM-based zero-shot ranking: Pointwise, Pairwise, and Listwise. Through the first-of-its-kind comparative evaluation within a consistent experimental framework and considering factors like model size, token consumption, latency, among others, we show that existing approaches are inherently characterised by trade-offs between effectiveness and efficiency. We find that while Pointwise approaches score high on efficiency, they suffer from poor effectiveness. Conversely, Pairwise approaches demonstrate superior effectiveness but incur high computational overhead. Our Setwise approach, instead, reduces the number of LLM inferences and the amount of prompt token consumption during the ranking procedure, compared to previous methods. This significantly improves the efficiency of LLM-based zero-shot ranking, while also retaining high zero-shot ranking effectiveness. We make our code and results publicly available at https://github.com/ielab/llm-rankers.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the trade-off between efficiency and effectiveness in zero-shot document ranking with large language models (LLMs). Specifically, it focuses on the following points:

1. **Limitations of existing methods**:
   - **Pointwise methods**: They are efficient but less effective.
   - **Pairwise methods**: They are more effective, but they incur high computational overhead and are therefore inefficient.
   - **Listwise methods**: They offer a compromise between efficiency and effectiveness, but they rely on generating the entire list of document labels, which is relatively slow in practice.
2. **Lack of fair comparison**:
   - The existing literature lacks a fair comparison of the effectiveness and efficiency of different LLM-based zero-shot ranking methods within a unified experimental framework.
3. **Proposing a new solution**:
   - The paper proposes a new set-based prompting approach (the Setwise prompting approach) that improves the efficiency of zero-shot ranking while maintaining high ranking effectiveness.

### Core idea of the solution

- **Setwise Prompting**: By comparing multiple documents at once (instead of a single pair), the approach reduces both the number of LLM inferences and the prompt tokens consumed. For example, when ranking with heap sort, a Pairwise prompt compares only two documents per inference, whereas a Setwise prompt can compare several (for example, 4) at once, which significantly reduces the total number of comparisons.
- **Ranking by combining Logits**: Setwise prompting speeds up the sorting-based ranking used by Pairwise methods and can also improve Listwise methods. By using the logits output by the LLM to estimate the ranking likelihood of each document label, the method avoids generating the entire list of document labels, which improves efficiency.
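
The Setwise idea described above can be illustrated with a small sketch. It is not the authors' implementation: the LLM is replaced by a hypothetical `mock_label_logits` function that returns one logit per candidate label, the names `setwise_top_k` and `pick_best` are illustrative, and comparing a node with up to `num_children` children in a single inference is an assumption about how the heap is arranged.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Doc = Tuple[str, int]  # (doc_id, toy relevance score); purely illustrative
LABELS = "ABCDEFGHIJ"  # one label per candidate shown in a prompt


def mock_label_logits(candidates: Sequence[Doc]) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM call: one logit per document label.

    A real system would read these values from the model's next-token
    logits; here the toy relevance score is used directly.
    """
    return {LABELS[j]: float(rel) for j, (_, rel) in enumerate(candidates)}


def pick_best(candidates: Sequence[Doc]) -> int:
    """Return the index of the candidate whose label has the highest logit."""
    logits = mock_label_logits(candidates)
    best_label = max(logits, key=logits.get)
    return LABELS.index(best_label)


def setwise_top_k(
    docs: Sequence[Doc],
    k: int,
    num_children: int,
    pick: Callable[[Sequence[Doc]], int],
) -> Tuple[List[Doc], int]:
    """Top-k ranking via heap sort on a heap with `num_children` children per node.

    Each sift-down step compares a node and all of its children in ONE
    call to `pick`, which models one setwise LLM inference.
    Returns the top-k documents and the number of calls made.
    """
    heap = list(docs)
    calls = 0

    def sift_down(i: int, n: int) -> None:
        nonlocal calls
        while True:
            first = num_children * i + 1
            group = [i] + list(range(first, min(first + num_children, n)))
            if len(group) == 1:  # leaf node: nothing to compare
                return
            calls += 1
            best = group[pick([heap[j] for j in group])]
            if best == i:  # parent already the most relevant
                return
            heap[i], heap[best] = heap[best], heap[i]
            i = best

    n = len(heap)
    # Build the heap bottom-up, starting from the last node that has a child.
    for i in range((n - 2) // num_children, -1, -1):
        sift_down(i, n)

    ranked: List[Doc] = []
    for _ in range(min(k, n)):
        ranked.append(heap[0])  # root is the most relevant remaining doc
        n -= 1
        heap[0] = heap[n]       # move the last element to the root
        sift_down(0, n)
    return ranked, calls


if __name__ == "__main__":
    docs = [(f"d{i}", (i * 7) % 20) for i in range(20)]
    top, n_calls = setwise_top_k(docs, k=5, num_children=3, pick=pick_best)
    print(top, n_calls)
```

Because `setwise_top_k` returns the number of calls to `pick`, running it with different `num_children` values gives a rough way to see how the set size affects the number of simulated inferences.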

### Experimental verification

The paper demonstrates the effectiveness and efficiency of the Setwise method through extensive experiments on the TREC Deep Learning 2019 and 2020 and BEIR benchmark datasets, tested across different LLM sizes. The results show that the Setwise method maintains high NDCG@10 while significantly reducing the number of LLM inferences, input tokens, and generated tokens, as well as query latency.

### Main contributions

1. Proposing an innovative Setwise prompting method, which significantly improves the efficiency of zero-shot ranking while maintaining high ranking effectiveness.
2. Systematically evaluating existing LLM-based zero-shot ranking methods within a unified experimental framework, filling the gap in efficiency comparison in the literature.
3. Applying the Setwise method to Listwise methods, further enhancing their efficiency and effectiveness.

Through these contributions, the paper provides valuable insights for choosing the most suitable LLM-based zero-shot ranking method for practical application scenarios.
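
For readers unfamiliar with the NDCG@10 metric mentioned in the experimental verification, the following is a minimal sketch of one common formulation (exponential gain with a log2 position discount). The summary does not state which variant or evaluation tool the paper uses, so that choice is an assumption, and the function names `dcg_at_k` and `ndcg_at_k` are illustrative.

```python
import math
from typing import Sequence


def dcg_at_k(rels: Sequence[int], k: int = 10) -> float:
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))


def ndcg_at_k(ranked_rels: Sequence[int], all_rels: Sequence[int], k: int = 10) -> float:
    """DCG of the produced ranking divided by the DCG of the ideal ranking.

    ranked_rels: relevance labels in the order the ranker returned them.
    all_rels:    relevance labels of all judged documents for the query.
    """
    ideal = dcg_at_k(sorted(all_rels, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # A ranking that places the only relevant document second.
    print(ndcg_at_k([0, 1], [0, 1]))
```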