Abstract: The increasing volume of data stored in relational databases has led to the need for efficient querying and utilization of this data in various sectors. However, writing SQL queries requires specialized knowledge, which poses a challenge for non-professional users trying to access and query databases. Text-to-SQL parsing addresses this issue by converting natural language queries into SQL queries, thus making databases more accessible to non-expert users. To take advantage of recent developments in Large Language Models (LLMs), a range of new methods have emerged, with a primary focus on prompt engineering and fine-tuning. This survey provides a comprehensive overview of LLMs in text-to-SQL tasks, discussing benchmark datasets, prompt engineering, fine-tuning methods, and future research directions. We hope this review will enable readers to gain a broader understanding of the recent advances in this field and offer some insights into its future trajectory.

What problem does this paper attempt to address?

The paper primarily explores the application of large language models (LLMs) in the Text-to-SQL task and provides a comprehensive overview. Specifically, the paper attempts to address the following key issues:

1. **Enhancing the ability of non-expert users to access databases**: As the amount of data stored in relational databases continues to increase, the need for efficient querying and utilization of this data becomes increasingly urgent. However, writing SQL queries requires professional knowledge, which poses a barrier for non-experts. By converting natural language queries into SQL queries, Text-to-SQL parsing technology can simplify database access.
2. **Leveraging the advantages of large language models**: Given the recent development of large language models, researchers have developed a series of new methods, focusing on prompt engineering and fine-tuning. This paper reviews the application of these methods in the Text-to-SQL task.
3. **Providing a comprehensive survey of the Text-to-SQL field**: This paper aims to provide a comprehensive survey on the use of large language models for the Text-to-SQL task. The discussion includes benchmark datasets, prompt engineering techniques, fine-tuning methods, and future research directions. The authors hope that this review will help readers better understand the latest advancements in this field and provide some insights for future research.

In summary, the goal of this paper is to enhance the technology for converting natural language to SQL queries by investigating the application of large language models in the Text-to-SQL task, thereby improving non-expert users' access to databases.
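To make the task concrete, the following is a minimal sketch of what a zero-shot text-to-SQL prompt might look like and how a predicted query could be checked by executing it. It is not taken from the paper: the `employees` schema, the question, the prompt wording, and the helper names `build_prompt` and `run_query` are all assumptions made for illustration, and the "predicted" SQL is written by hand in place of a real model call.

```python
import sqlite3

# Hypothetical schema and question, chosen only for illustration.
SCHEMA = (
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);"
)
QUESTION = "What is the average salary in the engineering department?"

# Stand-in for an LLM response; no model is called in this sketch.
PREDICTED_SQL = "SELECT AVG(salary) FROM employees WHERE dept = 'engineering';"


def build_prompt(schema: str, question: str) -> str:
    """Assemble a zero-shot prompt from a database schema and a question."""
    return (
        "Given the following SQLite schema, write one SQL query "
        "that answers the question.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


def run_query(sql: str) -> list:
    """Execute a query against a small in-memory database built from SCHEMA."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany(
            "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
            [
                ("Ada", "engineering", 120.0),
                ("Lin", "engineering", 100.0),
                ("Sam", "sales", 80.0),
            ],
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


prompt = build_prompt(SCHEMA, QUESTION)
rows = run_query(PREDICTED_SQL)
print(prompt)
print(rows)
```

Running the predicted query against a real database, as done here, is one simple way to check that the generated SQL is at least valid and returns a plausible answer.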

A Survey on Employing Large Language Models for Text-to-SQL Tasks
