A Survey on Employing Large Language Models for Text-to-SQL Tasks

Liang Shi, Zhengju Tang, Nan Zhang, Xiaotong Zhang, Zhi Yang
2024-08-11
Abstract: The increasing volume of data stored in relational databases has led to the need for efficient querying and utilization of this data in various sectors. However, writing SQL queries requires specialized knowledge, which poses a challenge for non-professional users trying to access and query databases. Text-to-SQL parsing solves this issue by converting natural language queries into SQL queries, thus making database access more accessible for non-expert users. To take advantage of the recent developments in Large Language Models (LLMs), a range of new methods have emerged, with a primary focus on prompt engineering and fine-tuning. This survey provides a comprehensive overview of LLMs in text-to-SQL tasks, discussing benchmark datasets, prompt engineering, fine-tuning methods, and future research directions. We hope this review will enable readers to gain a broader understanding of the recent advances in this field and offer some insights into its future trajectory.
Computation and Language
What problem does this paper attempt to address?
The paper surveys how large language models (LLMs) are applied to the Text-to-SQL task. Specifically, it addresses the following key issues:

1. **Enhancing the ability of non-expert users to access databases**: As the amount of data stored in relational databases grows, so does the need to query and use that data efficiently. Writing SQL queries, however, requires specialized knowledge, which is a barrier for non-experts. Text-to-SQL parsing lowers this barrier by converting natural language queries into SQL queries.
2. **Leveraging the advantages of large language models**: Building on recent progress in LLMs, researchers have developed a range of new methods that focus primarily on prompt engineering and fine-tuning. The paper reviews how these methods are applied to the Text-to-SQL task.
3. **Providing a comprehensive survey of the Text-to-SQL field**: The discussion covers benchmark datasets, prompt engineering techniques, fine-tuning methods, and future research directions. The authors hope the review will help readers understand recent advances in the field and offer some insight into where it is heading.

In summary, the paper does not propose a new Text-to-SQL method. It reviews how LLMs are used to convert natural language into SQL queries, with the aim of supporting further work on making databases accessible to non-expert users.
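To make the task concrete, the following is a minimal sketch of the prompt-engineering side of Text-to-SQL: a zero-shot prompt is assembled from a database schema and a natural language question, and a candidate SQL query is executed to check its result. The prompt template, the `employees` table, the sample rows, and the function names `build_prompt` and `run_query` are all illustrative assumptions and are not taken from the paper. No model is called; `EXPECTED_SQL` is a hand-written stand-in for the query an LLM would be asked to produce.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
SCHEMA = (
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);"
)
ROWS = [
    ("Ada", "engineering", 120.0),
    ("Bo", "engineering", 100.0),
    ("Cy", "sales", 80.0),
]
QUESTION = "What is the average salary in the engineering department?"

# Hand-written stand-in for the SQL an LLM would be expected to return.
EXPECTED_SQL = "SELECT AVG(salary) FROM employees WHERE dept = 'engineering';"


def build_prompt(schema: str, question: str) -> str:
    """Assemble a zero-shot Text-to-SQL prompt (assumed template)."""
    return (
        "Given the following SQLite schema, write one SQL query "
        "that answers the question.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


def run_query(sql: str) -> list:
    """Run a query against a fresh in-memory copy of the toy database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany(
            "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
            ROWS,
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(build_prompt(SCHEMA, QUESTION))
    print(run_query(EXPECTED_SQL))
```

In a real pipeline the prompt would be sent to a model and the returned SQL passed to `run_query`. Comparing the execution result of the generated query with that of a reference query is one common way to judge whether the generated query is correct.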