Abstract: Grammar-based parsers have achieved high performance in the cross-domain text-to-SQL parsing task, but suffer from low decoding efficiency due to the much larger number of actions for grammar selection than that of tokens in SQL queries. Meanwhile, how to better align SQL clauses and question segments has been a key challenge for parsing performance. Therefore, this paper proposes clause-level parallel decoding and alignment loss to enhance two high-performance grammar-based parsers, i.e., RATSQL and LGESQL. Experimental results of two parsers show that our method obtains consistent improvements both in accuracy and decoding speed.

What problem does this paper attempt to address?

The paper addresses two problems:

1. **Low decoding efficiency**: Grammar-based parsers achieve high performance on cross-domain text-to-SQL parsing. However, the number of grammar-selection actions is much larger than the number of tokens in the SQL query, so decoding is slow. This is a particular concern in practical applications that require a quick response.
2. **Poor alignment between SQL clauses and question segments**: Better aligning SQL clauses with question segments is a key challenge for parsing performance. Existing methods often struggle to capture this alignment when dealing with complex SQL structures.

To address these problems, the paper proposes two strategies, **clause-level parallel decoding** and **alignment loss**, to enhance two high-performance grammar-based parsers, RATSQL and LGESQL. Experimental results show consistent improvements in both accuracy and decoding speed.

### Specific methods

1. **Clause-level parallel decoding**:
   - SQL clauses are generated in parallel rather than sequentially, which improves decoding efficiency. Each clause is generated independently and no longer depends on the state of the previous clause.
   - This exploits the loose coupling between the generation of different clauses to increase decoding speed.
2. **Alignment loss**:
   - A new training loss, the alignment loss, encourages the model to attend to the relevant question segments when generating each clause.
   - With the alignment loss, the model captures the alignment between SQL clauses and question segments more accurately, which improves parsing accuracy.
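The two ideas above can be illustrated with a toy Python sketch. It is not the paper's implementation: the clause action lists are invented, the "decoder" simply replays them, and the loss shown (negative log of the attention mass on the aligned question segment) is an assumed form chosen for illustration. The names `CLAUSE_ACTIONS`, `sequential_steps`, `parallel_decode`, and `alignment_loss` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical per-clause grammar actions, invented purely for illustration.
CLAUSE_ACTIONS: Dict[str, List[str]] = {
    "SELECT": ["Select", "Agg(none)", "Column(name)"],
    "FROM": ["From", "Table(singer)"],
    "WHERE": ["Where", "Cond(>)", "Column(age)", "Value"],
}


def sequential_steps(clauses: Dict[str, List[str]]) -> int:
    """Decoder steps needed when clauses are emitted one after another."""
    return sum(len(actions) for actions in clauses.values())


def parallel_decode(
    clauses: Dict[str, List[str]],
) -> Tuple[Dict[str, List[str]], int]:
    """Advance every unfinished clause by one action per step.

    The inner loop stands in for a single batched decoder call, so the
    step count equals the length of the longest clause. A real decoder
    would predict each action; this toy version replays the given lists.
    """
    outputs: Dict[str, List[str]] = {name: [] for name in clauses}
    steps = 0
    while any(len(outputs[name]) < len(clauses[name]) for name in clauses):
        for name, actions in clauses.items():
            done = len(outputs[name])
            if done < len(actions):
                outputs[name].append(actions[done])
        steps += 1
    return outputs, steps


def alignment_loss(attention: Sequence[float], segment: Sequence[int]) -> float:
    """Assumed loss form: -log of the attention mass on the aligned segment."""
    mass = sum(attention[i] for i in segment)
    return -math.log(max(mass, 1e-12))  # clamp to avoid log(0)


if __name__ == "__main__":
    decoded, steps = parallel_decode(CLAUSE_ACTIONS)
    print("sequential steps:", sequential_steps(CLAUSE_ACTIONS))
    print("parallel steps:", steps)
    # Attention concentrated on the aligned tokens (indices 1 and 2)
    # yields a smaller loss than attention spread elsewhere.
    print(alignment_loss([0.1, 0.6, 0.2, 0.1], [1, 2]))
    print(alignment_loss([0.4, 0.1, 0.1, 0.4], [1, 2]))
```

With these made-up clauses, sequential decoding takes 9 steps (3 + 2 + 4), while the parallel loop takes 4, the length of the longest clause. In a real parser the gain would also depend on batching overhead and on how the clauses are recombined into a full query.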
### Experimental results

- **Improvement in accuracy**: On the Spider dataset, the two strategies increase the accuracy of RATSQL and LGESQL by 0.6% and 0.2%, respectively.
- **Improvement in decoding speed**: Decoding speed increases by 18.9% and 35.5%, respectively. The improvement is more pronounced for LGESQL because its grammar is simpler and its action sequences are shorter.

### Conclusion

The clause-level parallel decoding and alignment loss proposed in the paper improve both the efficiency and the accuracy of grammar-based text-to-SQL parsing models. These improvements matter for handling complex, cross-domain SQL query tasks.

Faster and Better Grammar-based Text-to-SQL Parsing via Clause-level Parallel Decoding and Alignment Loss
