Redefining Information Retrieval of Structured Database via Large Language Models

Mingzhu Wang, Yuzhe Zhang, Qihang Zhao, Juanyi Yang, Hong Zhang
2024-05-09
Abstract: Retrieval augmentation is critical when Language Models (LMs) exploit non-parametric knowledge related to the query through external knowledge bases before reasoning. The retrieved information is incorporated into LMs as context alongside the query, enhancing the reliability of responses to factual questions. Prior research in retrieval augmentation typically follows a retriever-generator paradigm. In this context, traditional retrievers encounter challenges in precisely and seamlessly extracting query-relevant information from knowledge bases. To address this issue, this paper introduces a novel retrieval augmentation framework called ChatLR that primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval. Additionally, we construct an LLM-based search and question answering system tailored for the financial domain by fine-tuning an LLM on two tasks, Text2API and API-ID recognition. Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
This paper aims to make information retrieval from structured databases more accurate and concise. According to the paper, existing retrieval frameworks still struggle to reach satisfactory accuracy in practical applications, even after fine-tuning for downstream tasks. The paper proposes a new retrieval augmentation framework, ChatLR, which uses large language models (LLMs) as retrievers that generate precise database search commands, thereby improving the accuracy and efficiency of factual queries against structured databases. By mapping natural language queries to specific database search commands, ChatLR addresses the difficulty traditional retrievers have in precisely extracting query-relevant information from knowledge bases. The authors also build an LLM-based search and question-answering system tailored to the financial domain, fine-tuning the LLM on two tasks, Text2API and API-ID recognition, to further improve system performance. Experimental results show that ChatLR handles user queries effectively, with an overall information retrieval accuracy exceeding 98.8%.
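To make the idea of mapping a natural language query to a database search command more concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the toy records in `TOY_DB`, the `get_metric` API, the `NO_API` marker, and the command syntax are hypothetical, and `stub_llm_text2api` is a regex stand-in for what would be a fine-tuned LLM in a real system. The paper's actual API schema, prompts, and API-ID recognition step are not described in this summary and are not modelled here.

```python
import ast
import re

# Hypothetical structured store: (company, metric, year) -> value.
# The names and figures are invented for this example.
TOY_DB = {
    ("AcmeCorp", "revenue", 2022): 120.5,
    ("AcmeCorp", "net_profit", 2022): 14.2,
}


def get_metric(company: str, metric: str, year: int):
    """Exact-match lookup; returns None when the record is absent."""
    return TOY_DB.get((company, metric, year))


# Only functions listed here may be invoked by a generated command.
ALLOWED_APIS = {"get_metric": get_metric}


def stub_llm_text2api(query: str) -> str:
    """Stand-in for an LLM that turns text into an API call string.

    A real system would call a model here; a regex keeps the sketch offline.
    """
    m = re.search(r"(revenue|net profit) of (\w+) in (\d{4})", query)
    if m is None:
        return "NO_API"  # hypothetical marker: no API matches the query
    metric = m.group(1).replace(" ", "_")
    return (
        f"get_metric(company={m.group(2)!r}, "
        f"metric={metric!r}, year={m.group(3)})"
    )


def parse_command(command: str):
    """Parse 'name(key=literal, ...)' into (name, kwargs) without eval."""
    node = ast.parse(command, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"not a simple API call: {command!r}")
    if node.args:
        raise ValueError("only keyword arguments are supported")
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs


def retrieve(query: str):
    """Generate a command for the query, validate it, and run the lookup."""
    command = stub_llm_text2api(query)
    if command == "NO_API":
        return None
    name, kwargs = parse_command(command)
    if name not in ALLOWED_APIS:
        raise ValueError(f"unknown API: {name}")
    return ALLOWED_APIS[name](**kwargs)


if __name__ == "__main__":
    print(retrieve("What was the revenue of AcmeCorp in 2022?"))
```

In this sketch the generated command is parsed with `ast` and checked against an allow-list instead of being passed to `eval`, so a malformed or unexpected model output fails loudly before it reaches the data. The lookup itself is exact, which is the property that distinguishes command-based retrieval from similarity-based retrieval over text chunks.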