Abstract: Text-to-SQL, the task of translating natural language questions into SQL queries, is part of various business processes. Its automation, which is an emerging challenge, will empower software practitioners to seamlessly interact with relational databases using natural language, thereby bridging the gap between business needs and software capabilities. In this paper, we consider Large Language Models (LLMs), which have achieved state-of-the-art results on various NLP tasks. Specifically, we benchmark Text-to-SQL performance, the evaluation methodologies, as well as input optimization (e.g., prompting). In light of our empirical observations, we propose two novel metrics designed to adequately measure the similarity between SQL queries. Overall, we share with the community various findings, notably on how to select the right LLM for Text-to-SQL tasks. We further demonstrate that a tree-based edit distance constitutes a reliable metric for assessing the similarity between generated SQL queries and the oracle when benchmarking Text-to-SQL approaches. This metric is important because it relieves researchers of the need to perform computationally expensive experiments, such as executing generated queries, as done in prior works. Our work implements financial-domain use cases and therefore contributes to the advancement of Text-to-SQL systems and their practical adoption in this domain.

What problem does this paper attempt to address?

The paper addresses the following problems:

1. **Improving the accuracy of Text-to-SQL conversion**: The paper evaluates how well different large language models (LLMs) translate natural language questions into SQL queries, covering model selection, input optimization, and performance benchmarking.
2. **Developing new evaluation metrics**: Because executing generated SQL queries to check their correctness can be computationally expensive, the paper proposes two new evaluation metrics: Tree Similarity of Edit Distance (TSED) and a second metric that this summary does not name. Both assess the quality of generated SQL queries without executing them.
3. **Optimizing the generated SQL queries**: The paper explores how rephrasing questions and optimizing prompts can improve Text-to-SQL models, so that the generated queries are more accurate and closer to the expected output.
4. **Filling the gap in financial-domain datasets**: The paper points out that existing Text-to-SQL datasets lack data for financial scenarios. The researchers therefore created a small dataset for bank transactions, containing 30 questions of varying difficulty, to challenge state-of-the-art models and expose the gap between practical applications and research evaluations.
5. **Exploring best practices for model selection**: The paper experimentally compares multiple LLMs on Text-to-SQL tasks and examines how factors such as model size and the amount of training data affect performance, providing guidance for selecting models suited to Text-to-SQL tasks in the financial domain.

In summary, through systematic research and experiments, this paper aims to make Text-to-SQL technology more effective in the financial domain, particularly the accuracy and efficiency of automated database querying.
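
To make the idea of an execution-free, tree-based similarity score concrete, here is a minimal sketch in Python. It is not the paper's implementation. The parser `toy_sql_tree` is a hypothetical stand-in that handles only flat `SELECT ... FROM ... WHERE ...` style queries, the edit distance uses unit costs for insert, delete, and relabel, and normalizing by the larger node count is an assumption made here for illustration.

```python
import re
from functools import lru_cache

# A tree is (label, children); children is a tuple of trees.
# A forest is a tuple of trees. Tuples keep everything hashable for the cache.

def size(forest):
    """Count the nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Ordered tree edit distance with unit costs (rightmost-root recursion)."""
    if not f1:
        return size(f2)  # insert everything that remains in f2
    if not f2:
        return size(f1)  # delete everything that remains in f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    relabel = (forest_distance(kids1, kids2)
               + forest_distance(f1[:-1], f2[:-1])
               + (label1 != label2))
    return min(delete, insert, relabel)

def toy_sql_tree(sql):
    """Hypothetical toy parser: one node per clause, one leaf per token.

    A real pipeline would build the tree with a proper SQL parser.
    """
    parts = re.split(r"\b(SELECT|FROM|WHERE|GROUP BY|ORDER BY)\b",
                     sql.strip().rstrip(";"), flags=re.IGNORECASE)
    clauses = []
    for keyword, body in zip(parts[1::2], parts[2::2]):
        leaves = tuple((tok.lower(), ())
                       for tok in re.findall(r"[\w.*]+|[<>=!]+", body))
        clauses.append((keyword.upper(), leaves))
    return ("QUERY", tuple(clauses))

def similarity(sql_a, sql_b):
    """Score in [0, 1]; 1.0 means the two toy trees are identical."""
    a, b = toy_sql_tree(sql_a), toy_sql_tree(sql_b)
    dist = forest_distance((a,), (b,))
    # Assumed normalization: divide by the node count of the larger tree.
    return max(1 - dist / max(size((a,)), size((b,))), 0.0)

generated = "SELECT name FROM accounts WHERE balance > 200"
oracle = "SELECT name FROM accounts WHERE balance > 100"
print(round(similarity(generated, oracle), 3))
```

In this sketch the two queries differ in a single leaf, so the distance is one relabel operation over a nine-node tree. No database is touched, which is the practical appeal of this family of metrics: the comparison needs only the query text.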

Enhancing Text-to-SQL Translation for Financial System Design
