Modular Retrieval for Generalization and Interpretation

Juhao Liang,Chen Zhang,Zhengyang Tang,Jie Fu,Dawei Song,Benyou Wang

2023-03-24

Abstract: New retrieval tasks have always been emerging, thus urging the development of new retrieval models. However, instantiating a retrieval model for each new retrieval task is resource-intensive and time-consuming, especially for a retrieval model that employs a large-scale pre-trained language model. To address this issue, we shift to a novel retrieval paradigm called modular retrieval, which aims to solve new retrieval tasks by instead composing multiple existing retrieval modules. Built upon the paradigm, we propose a retrieval model with modular prompt tuning named REMOP. It constructs retrieval modules subject to task attributes with deep prompt tuning, and yields retrieval models subject to tasks with module composition. We validate that REMOP, with its inherent modularity, not only has appealing generalizability and interpretability in preliminary explorations, but also achieves comparable performance to state-of-the-art retrieval models on a zero-shot retrieval benchmark. Our code is available at https://github.com/FreedomIntelligence/REMOP.

Information Retrieval

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

The paper aims to address the following issues:

1. **Resource-intensive and time-consuming nature of new retrieval tasks**: As new retrieval tasks continuously emerge, building a separate retrieval model for each one is resource-intensive and time-consuming, especially when the model relies on a large-scale pre-trained language model (PLM).
2. **Improving the generalization ability and interpretability of models**: The paper proposes a new retrieval paradigm, modular retrieval, which addresses new tasks by composing multiple existing retrieval modules, with the aim of improving the generalization ability and interpretability of the retrieval process.

Specifically, the paper proposes a method called REMOP (REtrieval with MOdular Prompt Tuning), which uses Deep Prompt Tuning (DPT) to construct retrieval modules tied to task attributes and composes these modules to obtain a retrieval model for a specific task. The reported experiments show that REMOP achieves performance comparable to state-of-the-art retrieval models on a zero-shot retrieval benchmark, and preliminary explorations suggest good generalization ability and interpretability.
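
To make the idea of composing attribute-level modules more concrete, here is a minimal sketch. It assumes each task attribute owns a learned prompt (a short list of floats stands in for deep-prompt parameters) and that a task-level prompt is obtained by element-wise averaging of the selected modules. The averaging rule, the attribute names, the numeric values, and the helper `compose_modules` are all illustrative assumptions, not the composition scheme or API described in the paper.

```python
from typing import Dict, List

Prompt = List[float]

# Hypothetical attribute modules: in a real system each entry would hold
# trained deep-prompt parameters; the values here are placeholders.
MODULES: Dict[str, Prompt] = {
    "topic:science": [0.25, 0.5, 0.75],
    "format:question": [0.75, 0.0, 0.25],
    "domain:biomedical": [0.5, 0.5, 0.5],
}


def compose_modules(attributes: List[str], modules: Dict[str, Prompt]) -> Prompt:
    """Build a task-level prompt by averaging the prompts of its attributes.

    Averaging is an assumption made for this sketch; other composition
    rules (weighted sums, concatenation) would fit the same interface.
    """
    if not attributes:
        raise ValueError("a task needs at least one attribute")
    missing = [name for name in attributes if name not in modules]
    if missing:
        raise KeyError(f"no module trained for attributes: {missing}")

    selected = [modules[name] for name in attributes]
    dim = len(selected[0])
    if any(len(prompt) != dim for prompt in selected):
        raise ValueError("all modules must share the same prompt dimension")

    # Element-wise mean over the selected attribute prompts.
    return [sum(prompt[i] for prompt in selected) / len(selected) for i in range(dim)]


# A new task described only by attributes that already have modules:
task_prompt = compose_modules(["topic:science", "format:question"], MODULES)
print(task_prompt)
```

Because a new task is described only by the attributes it shares with earlier tasks, no additional training is needed in this sketch, and the list of attributes used for a task doubles as a simple record of which modules shaped its prompt.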
