Large Search Model: Redefining Search Stack in the Era of LLMs

Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei
DOI: https://doi.org/10.48550/arXiv.2310.14587
2024-01-02
Abstract: Modern search engines are built on a stack of different components, including query understanding, retrieval, multi-stage ranking, and question answering, among others. These components are often optimized and deployed independently. In this paper, we introduce a novel conceptual framework called large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM). All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts. This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack. To substantiate the feasibility of this framework, we present a series of proof-of-concept experiments and discuss the potential challenges associated with implementing this approach within real-world search systems.
Information Retrieval, Computation and Language
What problem does this paper attempt to address?
The paper addresses two problems with modern search engines. First, the components of the search stack (query understanding, retrieval, multi-stage ranking, question answering, and others) are usually optimized and deployed independently, which makes the stack complex and hard to maintain. Second, result quality still falls short for long-tail and complex information needs.

The paper proposes a new conceptual framework, the large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM). All search tasks are formulated as autoregressive text generation problems and are customized through natural language prompts. The aim is to use the language understanding and reasoning capabilities of LLMs to simplify the existing search stack while improving the quality of search results.

The main contributions of the paper are:

1. **Proposing a conceptual framework**: Introducing the "large search model", a unified LLM-based framework for handling search tasks.
2. **Experimental verification**: Demonstrating the feasibility of the framework through a series of proof-of-concept experiments.
3. **Challenges and future research directions**: Discussing the challenges of implementing the framework in real-world search systems, such as inference efficiency, long-context modeling, and multi-modal support, and calling for further research to address them.

Through these contributions, the paper hopes to advance search engine technology and make it more efficient, flexible, and user-friendly.
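As a rough illustration of what "all tasks are text generation, customized by prompts" could look like in practice, here is a minimal sketch. It is not the paper's implementation: the prompt wording, the task names, and the helpers `build_prompt`, `run_task`, and `echo_model` are all hypothetical, and `echo_model` is a stand-in so the sketch runs without a real LLM.

```python
from typing import Callable, Dict, List

# Hypothetical prompt templates; the wording is illustrative, not taken from the paper.
TASK_PROMPTS: Dict[str, str] = {
    "rank": "Rate how relevant each document is to the query.",
    "snippet": "Write a short snippet for each document that addresses the query.",
    "answer": "Answer the query using only the documents below.",
}


def build_prompt(task: str, query: str, documents: List[str]) -> str:
    """Compose one text prompt: task instruction, query, then numbered documents."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task}")
    doc_lines = [f"[{i}] {doc}" for i, doc in enumerate(documents, start=1)]
    return "\n".join([TASK_PROMPTS[task], f"Query: {query}", *doc_lines, "Output:"])


def run_task(
    generate: Callable[[str], str], task: str, query: str, documents: List[str]
) -> str:
    """Route every task through the same text-in, text-out call."""
    return generate(build_prompt(task, query, documents))


def echo_model(prompt: str) -> str:
    """Stand-in for an LLM (assumption): returns the first line of the prompt."""
    return prompt.splitlines()[0]
```

Under these assumptions, switching from ranking to answer generation means changing only the `task` argument; the model call and the input layout stay the same, which is the simplification the framework is after.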