Abstract: Information retrieval aims to efficiently return the information most relevant to a user query. With the rise of pre-trained language models (such as BERT and GPT), researchers have used the dense vector representations of these models to build dense retrieval methods that better capture semantic information. The emergence of large language models (such as ChatGPT) has prompted researchers to explore applying these models to practical retrieval tasks. However, large language models also increase computational and storage costs. To address these costs, this study proposes an Efficient Retrieval framework based on Distillation from Large language models (ERDL). The framework first strengthens the representational capacity of an encoder-only model through knowledge distillation from large language models, improving retrieval accuracy and relevance while preserving the efficiency advantage of the encoder-only model (which is typically smaller). It then uses the encoding capabilities of the large language model to compensate for information missing from the encoder-only model’s representation, further improving the encoder-only model through contrastive learning supervised by the large language model. Experimental results indicate that, compared to large language models, our method achieves significant improvements on three real-world datasets from the MTEB information retrieval task. While keeping the encoder-only model’s retrieval results competitive, our method improves retrieval speed by over 85%, effectively reducing computational costs.
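The two training signals named in the abstract, knowledge distillation from a large language model and LLM-supervised contrastive learning, can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss functions, so everything here is an assumption made for illustration: the KL-divergence distillation objective over query-document similarity scores, the in-batch InfoNCE contrastive objective, the temperature value, the embedding sizes, and all function names are hypothetical and not taken from ERDL itself.

```python
import numpy as np


def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def similarity(queries, docs, temperature=0.05):
    # Temperature-scaled cosine similarity, shape (n_queries, n_docs).
    return l2_normalize(queries) @ l2_normalize(docs).T / temperature


def distillation_loss(student_scores, teacher_scores):
    # Assumed objective: KL(teacher || student) over each query's
    # distribution of document scores, averaged over queries.
    log_p_teacher = log_softmax(teacher_scores)
    log_p_student = log_softmax(student_scores)
    kl = (np.exp(log_p_teacher) * (log_p_teacher - log_p_student)).sum(axis=-1)
    return float(kl.mean())


def in_batch_contrastive_loss(scores):
    # Assumed objective: InfoNCE with in-batch negatives, where the
    # i-th document is the positive for the i-th query.
    return float(-np.diag(log_softmax(scores)).mean())


# Toy data: the teacher (LLM) and student (encoder-only model) may use
# different embedding sizes, because only the score matrices are compared.
rng = np.random.default_rng(0)
student_q, student_d = rng.normal(size=(4, 16)), rng.normal(size=(4, 16))
teacher_q, teacher_d = rng.normal(size=(4, 64)), rng.normal(size=(4, 64))

student_scores = similarity(student_q, student_d)
teacher_scores = similarity(teacher_q, teacher_d)

total = distillation_loss(student_scores, teacher_scores) \
    + in_batch_contrastive_loss(student_scores)
print(f"combined loss: {total:.4f}")
```

In this sketch only the small student model would be needed at query time, which is where an efficiency gain of the kind the abstract reports would come from; the teacher scores are used during training only.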

ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder Via Self On-the-fly Distillation for Dense Passage Retrieval

Translate-Distill: Learning Cross-Language Dense Retrieval by Translation and Distillation

Towards Better Entity Linking with Multi-View Enhanced Distillation

Query Encoder Distillation via Embedding Alignment is a Strong Baseline Method to Boost Dense Retriever Online Efficiency

Distilled Dual-Encoder Model for Vision-Language Understanding

A Multi-level Distillation based Dense Passage Retrieval Model

Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval

How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?

EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval

ERDL: Efficient Retrieval Framework Based on Distillation from Large Language Models

WIDER & CLOSER: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition

Training Task Experts through Retrieval Based Distillation

XtremeDistil: Multi-stage Distillation for Massive Multilingual Models

Learning Cross-Lingual IR from an English Retriever

Dual teachers for self-knowledge distillation

LEAD: Liberal Feature-based Distillation for Dense Retrieval

Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System

TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant

Exploring Dual Encoder Architectures for Question Answering

Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval