LLM-Augmented Retrieval: Enhancing Retrieval Models Through Language Models and Doc-Level Embedding

Mingrui Wu, Sheng Cao
2024-04-09
Abstract: Recently, embedding-based retrieval, or dense retrieval, has shown state-of-the-art results compared with traditional sparse or bag-of-words-based approaches. This paper introduces a model-agnostic doc-level embedding framework through large language model (LLM) augmentation. It also improves important components of the retrieval model training process, such as negative sampling and the loss function. By implementing this LLM-augmented retrieval framework, we have been able to significantly improve the effectiveness of widely used retriever models such as Bi-encoders (Contriever, DRAGON) and late-interaction models (ColBERTv2), thereby achieving state-of-the-art results on the LoTTE and BEIR datasets.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper aims to improve the performance of information retrieval systems by introducing a new framework, LLM-Augmented Retrieval. Specifically, it addresses the following issues:

1. **Richness of Document Embeddings**:
   - Proposes a model-agnostic document-level embedding framework that uses large language models (LLMs) to enrich the contextual information of documents, thereby improving the quality and robustness of existing retrievers.
2. **Improvement of the Retrieval Model Training Process**:
   - Enhances several key components of retrieval model training, such as negative sampling and the loss function.
3. **Support for Multiple Model Architectures**:
   - The framework applies both to Bi-encoders (e.g., Contriever, DRAGON) and to late-interaction models (e.g., ColBERTv2).

Together, these improvements significantly enhance the effectiveness of widely used retriever models and achieve state-of-the-art results on the LoTTE and BEIR datasets.
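
The summary above does not spell out how a document-level embedding is composed, so the following is only a minimal sketch of one plausible reading: a query is scored against a document by taking its best-matching chunk and adding a weighted similarity to each LLM-generated field. The field names (`synthetic_query`, `title`), the weights, and the function `doc_level_score` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def doc_level_score(query_vec, chunk_vecs, field_vecs, field_weights):
    """Hypothetical doc-level score.

    score = max cosine similarity over document chunks
          + sum over augmented fields of weight * cosine(query, mean field vector)

    `field_vecs` maps an assumed field name (e.g. "synthetic_query", "title")
    to the embeddings of the LLM-generated texts for that field.
    """
    q = normalize(query_vec)
    best_chunk = max(dot(q, normalize(c)) for c in chunk_vecs)
    field_term = 0.0
    for name, vecs in field_vecs.items():
        if not vecs:
            continue  # skip fields with no generated text
        field_embedding = normalize(mean_vector(vecs))
        field_term += field_weights.get(name, 0.0) * dot(q, field_embedding)
    return best_chunk + field_term


# Toy 2-dimensional embeddings, chosen by hand for illustration only.
score = doc_level_score(
    query_vec=[1.0, 0.0],
    chunk_vecs=[[1.0, 0.0], [0.0, 1.0]],
    field_vecs={
        "synthetic_query": [[1.0, 0.0], [1.0, 0.0]],
        "title": [[0.0, 1.0]],
    },
    field_weights={"synthetic_query": 0.5, "title": 0.5},
)
```

Because the scoring only consumes embeddings, any encoder could supply them, which is one way to read the "model-agnostic" claim. A late-interaction model such as ColBERTv2 scores at the token level and would need a different combination step than this sketch shows.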