Towards Boosting LLMs-driven Relevance Modeling with Progressive Retrieved Behavior-augmented Prompting

Zeyuan Chen, Haiyan Wu, Kaixin Wu, Wei Chen, Mingjie Zhong, Jia Xu, Zhongyi Liu, Wei Zhang
2024-08-18
Abstract: Relevance modeling is a critical component for enhancing user experience in search engines, with the primary objective of identifying items that align with users' queries. Traditional models only rely on the semantic congruence between queries and items to ascertain relevance. However, this approach represents merely one aspect of the relevance judgement, and is insufficient in isolation. Even powerful Large Language Models (LLMs) still cannot accurately judge the relevance of a query and an item from a semantic perspective. To augment LLMs-driven relevance modeling, this study proposes leveraging user interactions recorded in search logs to yield insights into users' implicit search intentions. The challenge lies in the effective prompting of LLMs to capture dynamic search intentions, which poses several obstacles in real-world relevance scenarios, i.e., the absence of domain-specific knowledge, the inadequacy of an isolated prompt, and the prohibitive costs associated with deploying LLMs. In response, we propose ProRBP, a novel Progressive Retrieved Behavior-augmented Prompting framework for integrating search scenario-oriented knowledge with LLMs effectively. Specifically, we perform the user-driven behavior neighbors retrieval from the daily search logs to obtain domain-specific knowledge in time, retrieving candidates that users consider to meet their expectations. Then, we guide LLMs for relevance modeling by employing advanced prompting techniques that progressively improve the outputs of the LLMs, followed by a progressive aggregation with comprehensive consideration of diverse aspects. For online serving, we have developed an industrial application framework tailored for the deployment of LLMs in relevance modeling. Experiments on real-world industry data and online A/B testing demonstrate our proposal achieves promising performance.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to improve the performance of large language models (LLMs) in relevance modeling for search. Traditional relevance models judge relevance mainly by the semantic match between a query and an item. On its own, this signal is insufficient, particularly when queries and item descriptions are short and ambiguous, because it does not capture the user's underlying search intention. Even powerful LLMs, despite their strength on natural language processing tasks, struggle to judge query-item relevance accurately from semantic understanding alone, and in complex industrial search scenarios the lack of domain-specific knowledge is a major obstacle.

To address these problems, the paper proposes **ProRBP** (Progressive Retrieved Behavior-augmented Prompting), a framework that infers users' implicit search intentions from interaction data recorded in search logs and applies advanced prompting techniques to improve the LLM's outputs step by step, thereby improving the accuracy of relevance modeling. The framework contains two main modules:

1. **User-driven behavior neighbor retrieval**: Retrieves, from daily-updated search logs, behavior neighbors (candidates that users consider to meet their expectations) to supply timely domain-specific knowledge.
2. **Progressive prompting and aggregation**: Uses a series of prompts, ordered from simple to complex, to guide the LLM toward more discriminating relevance judgments, then aggregates information from the different aspects into a comprehensive relevance judgment.

The paper also examines how to deploy LLMs efficiently in industrial scenarios and proposes a collaborative offline-online serving approach that reduces cost and latency while maintaining the accuracy of relevance judgments.
According to the paper, experiments on real-world industry data and online A/B testing show that the framework outperforms existing relevance modeling methods.
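
As a rough illustration of the two modules described above, the following Python sketch retrieves behavior neighbors from a toy click log, builds prompts from simple to complex, and averages the resulting scores. It is not the authors' implementation: the log schema, the click-count retrieval heuristic, the prompt wording, the weighted-average aggregation, and every function name (`retrieve_behavior_neighbors`, `build_progressive_prompts`, `aggregate`, `relevance`) are assumptions made for illustration. The LLM is represented by a caller-supplied `score_fn` that maps a prompt to a score in [0, 1].

```python
from collections import Counter

# Hypothetical search-log rows: (query, clicked item title).
# The real log schema is not described in the source.
SEARCH_LOG = [
    ("apple", "Apple iPhone 15 case"),
    ("apple", "Apple iPhone 15 case"),
    ("apple", "Apple iPhone 15 case"),
    ("apple", "Apple AirPods Pro"),
    ("apple", "Apple AirPods Pro"),
    ("apple", "Fresh red apples 1kg"),
    ("pear", "Fresh pears 1kg"),
]


def retrieve_behavior_neighbors(log, query, k=2):
    """Return the k items most often clicked for this query.

    Click frequency is an assumed stand-in for "candidates that users
    consider to meet their expectations".
    """
    counts = Counter(item for q, item in log if q == query)
    return [item for item, _ in counts.most_common(k)]


def build_progressive_prompts(query, item, neighbors):
    """Build prompts from simple (semantic only) to complex (with behavior)."""
    base = (
        f"Query: {query}\nItem: {item}\n"
        "Is the item relevant to the query? Answer with a score from 0 to 1."
    )
    if not neighbors:
        # No behavior evidence for this query: fall back to the simple prompt.
        return [base]
    listing = "\n".join(f"- {n}" for n in neighbors)
    augmented = f"Users who searched '{query}' clicked:\n{listing}\n{base}"
    return [base, augmented]


def aggregate(scores, weights=None):
    """Weighted average of per-prompt scores (assumed aggregation rule)."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def relevance(query, item, log, score_fn, k=2):
    """Score an item for a query; score_fn stands in for the LLM call."""
    neighbors = retrieve_behavior_neighbors(log, query, k)
    prompts = build_progressive_prompts(query, item, neighbors)
    return aggregate([score_fn(p) for p in prompts])
```

In a real system, `score_fn` would wrap a call to the deployed LLM, and the log would be refreshed daily as the summary describes; both are left abstract here because the source does not specify them.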