ERDL: Efficient Retrieval Framework Based on Distillation from Large Language Models

Heng Yu, Rui Li, Zheng Zhang, Shengyu Ye, Qi Liu, Zhenya Huang, Enhong Chen
DOI: https://doi.org/10.1109/clnlp64123.2024.00023
2024-01-01
Abstract: The aim of information retrieval is to efficiently retrieve the most relevant information for a user query. With the rise of pre-trained language models (such as BERT and GPT), researchers have begun to exploit their dense vector representations, proposing dense retrieval methods that better capture semantic information. The emergence of large language models (such as ChatGPT) has prompted researchers to explore applying these models to practical retrieval tasks. However, large language models also increase computational and storage costs. To address these costs, this study proposes an Efficient Retrieval framework based on Distillation from Large language models (ERDL). The framework first enhances the representational capacity of an encoder-only model through knowledge distillation from large language models, improving the accuracy and relevance of retrieval while preserving the efficiency advantage of the encoder-only model (which is typically smaller). It then uses the encoding capabilities of the large language model to compensate for information missing from the encoder-only model’s representation, further improving the encoder-only model through contrastive learning supervised by the large language model. Experiments on three real-world datasets from the MTEB information retrieval task indicate that our method achieves significant improvements compared to large language models. While keeping the encoder-only model's retrieval results competitive, our method improves retrieval speed by over 85%, effectively reducing computational cost.
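The two training signals named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the loss functions, so the sketch assumes a common choice for each stage. Distillation is written as a KL divergence between teacher and student query-document score distributions, and the contrastive stage as an in-batch InfoNCE loss. All function names, the temperature value, and the use of cosine similarity are assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def log_softmax(scores):
    # Numerically stable row-wise log-softmax.
    shifted = scores - scores.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def score_matrix(queries, docs, temperature):
    # Entry (i, j) is the temperature-scaled cosine similarity of query i and document j.
    return l2_normalize(queries) @ l2_normalize(docs).T / temperature

def distillation_loss(student_q, student_d, teacher_q, teacher_d, temperature=0.05):
    # Assumed stage 1: KL(teacher || student) over in-batch score distributions.
    # Comparing scores rather than raw vectors lets the teacher (LLM) and the
    # student (encoder-only model) use different embedding dimensions.
    log_p_teacher = log_softmax(score_matrix(teacher_q, teacher_d, temperature))
    log_p_student = log_softmax(score_matrix(student_q, student_d, temperature))
    p_teacher = np.exp(log_p_teacher)
    return float((p_teacher * (log_p_teacher - log_p_student)).sum(axis=1).mean())

def contrastive_loss(queries, docs, temperature=0.05):
    # Assumed stage 2: InfoNCE with in-batch negatives, where document i is the
    # positive for query i and every other document in the batch is a negative.
    log_p = log_softmax(score_matrix(queries, docs, temperature))
    return float(-np.diag(log_p).mean())
```

In this sketch, each argument is a 2-D array with one embedding per row, and row i of the document array is the positive for row i of the query array. The distillation term is zero when the student reproduces the teacher's score distribution exactly and grows as the two rankings diverge. How ERDL actually combines or schedules the two stages is not stated in the abstract.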