Abstract: Instruction-tuned language models (LM) are able to respond to imperative commands, providing a more natural user interface compared to their base counterparts. In this work, we present Promptriever, the first retrieval model able to be prompted like an LM. To train Promptriever, we curate and release a new instance-level instruction training set from MS MARCO, spanning nearly 500k instances. Promptriever not only achieves strong performance on standard retrieval tasks, but also follows instructions. We observe: (1) large gains (reaching SoTA) on following detailed relevance instructions (+14.3 p-MRR / +3.1 nDCG on FollowIR), (2) significantly increased robustness to lexical choices/phrasing in the query+instruction (+12.9 Robustness@10 on InstructIR), and (3) the ability to perform hyperparameter search via prompting to reliably improve retrieval performance (+1.4 average increase on BEIR). Promptriever demonstrates that retrieval models can be controlled with prompts on a per-query basis, setting the stage for future work aligning LM prompting techniques with information retrieval.

What problem does this paper attempt to address?

### What problems does this paper attempt to solve?

This paper addresses the limited flexibility and accuracy of information retrieval (IR) models when they are given natural language instructions. Traditional retrieval models match queries and documents using a single semantic similarity score, which makes the user experience rigid: users must repeatedly adjust keywords or use advanced search settings to find the documents they need.

The paper proposes **Promptriever**, a retrieval model that can be controlled with natural language prompts, like a language model. Unlike traditional retrieval models, Promptriever can adjust its notion of relevance on a per-query basis according to the instruction it receives, which gives more flexible and accurate retrieval results. The main problems the paper attempts to solve are:

1. **Enhancing the instruction-following ability of retrieval models**
   - After standard IR training, retrieval models lose the ability to respond to natural language instructions. By training on an instance-level instruction dataset built from MS MARCO (nearly 500k instances), Promptriever retains the instruction-following ability of its underlying language model.

2. **Improving the retrieval model's understanding of complex instructions**
   - Promptriever can handle complex instructions, including detailed relevance definitions, and its retrieval performance can be improved through prompts in a zero-shot setting. For example, a user can state retrieval conditions in natural language, such as "Only retrieve James Cameron movies that were not co-directed and were made before 2022".

3. **Increasing the robustness of the retrieval model**
   - Promptriever is more robust to changes in query length and wording, which reduces the performance fluctuations caused by different query forms. For example, the summary reports a 44% reduction in variance on BEIR, and the abstract reports a gain of 12.9 points in Robustness@10 on InstructIR.

4. **Achieving zero-shot prompt optimization**
   - Promptriever's retrieval performance can be reliably improved with simple natural language prompts (such as "Carefully consider relevance and I will tip you"), with an average gain of 1.4 points on BEIR. This makes prompt engineering and automatic prompting methods applicable to retrieval.

### Summary

By introducing Promptriever, the paper shows that modern dual-encoder retrieval models can acquire natural language instruction-following ability when given appropriate training data, which significantly improves retrieval performance and user experience. The method performs well on standard retrieval tasks and reaches the state of the art on instruction-following tasks.
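One way to picture the "hyperparameter search via prompting" idea is the minimal Python sketch below: each candidate prompt is appended to every dev-set query, the corpus is ranked, and the prompt with the best mean score is kept. Everything here is an assumption made for illustration and is not taken from the paper: the `embed` function is a toy bag-of-words stand-in for a trained encoder (it does not follow instructions), the names `retrieve`, `ndcg_at_k`, and `best_prompt` are hypothetical, and binary-relevance nDCG@10 is used only as an example selection metric.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in encoder: bag-of-words counts (NOT a trained retriever)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, prompt, corpus, k=10):
    """Rank document ids by similarity to the query with the prompt appended."""
    q_vec = embed(f"{query} {prompt}".strip())
    ranked = sorted(
        corpus,
        key=lambda doc_id: cosine(q_vec, embed(corpus[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def ndcg_at_k(ranked, relevant, k=10):
    """nDCG@k with binary relevance labels."""
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc_id in enumerate(ranked[:k])
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0


def best_prompt(prompts, dev_set, corpus, k=10):
    """Return the candidate prompt with the highest mean nDCG@k on the dev set."""
    def mean_score(prompt):
        scores = [
            ndcg_at_k(retrieve(query, prompt, corpus, k), relevant, k)
            for query, relevant in dev_set
        ]
        return sum(scores) / len(scores)

    return max(prompts, key=mean_score)


# Hypothetical toy data, invented for this example only.
corpus = {
    "d1": "avatar directed by james cameron released 2009",
    "d2": "titanic film james cameron 1997",
    "d3": "cooking pasta recipe",
}
dev_set = [("james cameron", {"d1"})]
chosen = best_prompt(["", "released 2009"], dev_set, corpus)
```

With a real instruction-trained retriever in place of `embed`, the same loop would compare free-form prompts rather than extra keywords; the toy encoder only rewards lexical overlap, so it illustrates the selection mechanics and not the instruction-following behavior itself.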

Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
