TELLMe: Teaching and Exploiting Large Language Models for Model Selection in Text Retrieval

Zhenzi Li, Jun Bai, Zhuofan Chen, Chen Li, Yuanxin Ouyang, Wenge Rong
DOI: https://doi.org/10.1109/ijcnn60899.2024.10651417
2024-01-01
Abstract: Text retrieval, a pivotal application in Natural Language Processing (NLP), involves retrieving pertinent documents from a candidate pool for a given query. Owing to their outstanding performance and improved convergence, Pre-trained Language Models (PLMs) are extensively employed in text retrieval. However, selecting the best model from a multitude of PLMs remains a challenge, termed Model Selection (MS). In this paper, we address MS by proposing a novel two-stage approach, Teaching and Exploiting Large Language Models (TELLMe). In the first stage, we efficiently retrieve candidate models by prompting Large Language Models (LLMs), leveraging their extensive world knowledge; In-Context Learning (ICL) and Chain-of-Thought (CoT) are incorporated to overcome the inherent difficulty LLMs have in understanding MS tasks. After retrieving potentially high-performing models, the second stage employs Transferability Estimation (TE) for further model ranking. To tackle the issue of untrue hard labels faced by TE approaches, we propose a novel method, Evidence maximized from Soft labels (EaSe), which utilizes soft labels derived from LLM-encoded query embeddings and document embeddings to evaluate transferability scores. Our proposed approach combines these capabilities of LLMs and provides an efficient and generalized solution for MS in text retrieval. We further systematically evaluate the TELLMe framework on three diverse datasets, demonstrating its effectiveness and superiority.
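The abstract does not specify how EaSe turns LLM-encoded embeddings into soft labels. As an illustration only, the sketch below assumes one plausible construction: cosine similarity between a query embedding and each document embedding, normalized with a temperature-scaled softmax. The function names `cosine` and `soft_labels`, the `temperature` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length, non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def soft_labels(query_emb, doc_embs, temperature=0.1):
    # ASSUMPTION: soft labels are a softmax over scaled cosine similarities.
    # The paper may define them differently.
    scaled = [cosine(query_emb, d) / temperature for d in doc_embs]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy embeddings standing in for LLM-encoded vectors (hypothetical values).
query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = soft_labels(query, docs)
```

Unlike a hard 0/1 relevance label, each document here receives a graded relevance weight, and the weights for one query sum to 1. How these weights enter the evidence computation is left to the full paper.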