Error Detection for Text-to-SQL Semantic Parsing

Shijie Chen, Ziru Chen, Huan Sun, Yu Su
2023-12-06
Abstract: Despite remarkable progress in text-to-SQL semantic parsing in recent years, the performance of existing parsers is still far from perfect. Specifically, modern text-to-SQL parsers based on deep learning are often over-confident, thus casting doubt on their trustworthiness when deployed for real use. In this paper, we propose a parser-independent error detection model for text-to-SQL semantic parsing. Using a language model of code as its bedrock, we enhance our error detection model with graph neural networks that learn structural features of both natural language questions and SQL queries. We train our model on realistic parsing errors collected from a cross-domain setting, which leads to stronger generalization ability. Experiments with three strong text-to-SQL parsers featuring different decoding mechanisms show that our approach outperforms parser-dependent uncertainty metrics. Our model could also effectively improve the performance and usability of text-to-SQL semantic parsers regardless of their architectures. (Our implementation is available at https://github.com/OSU-NLP-Group/Text2SQL-Error-Detection)
Computation and Language
What problem does this paper attempt to address?
The paper addresses error detection in text-to-SQL semantic parsing, in particular semantic errors in executable SQL queries. Despite significant progress in recent years, existing parsers are still far from perfect, and deep learning-based parsers are often over-confident, which casts doubt on their reliability in practical use. The paper therefore proposes a parser-agnostic error detection model aimed at improving the reliability and usability of text-to-SQL semantic parsing. Its main contributions are:
1. It proposes the first general, parser-agnostic error detection model, which is effective across multiple tasks and different parser designs without any task-specific adjustments. Experiments show that the model outperforms parser-dependent uncertainty metrics and maintains high performance in cross-parser evaluation settings.
2. It presents the first comprehensive study of error detection in text-to-SQL parsing, evaluating error detection methods on both correct and incorrect SQL predictions. Simulated interactions further show that a more accurate error detector can significantly improve the efficiency and usability of interactive text-to-SQL parsing systems.
The error detection model is built on a language model of code and combined with graph neural networks that capture the structural features of natural language questions and SQL queries, which improves its performance and generalization ability. The model performs well on several tasks, including error detection, reranking, and interaction triggering.
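To make the three tasks named above more concrete, the sketch below shows one plausible way a parser-agnostic detector score could be used for reranking and interaction triggering. It is an illustration under stated assumptions, not the paper's implementation: `Candidate`, `rerank`, `should_trigger_interaction`, the 0.5 threshold, and `toy_detector` are all hypothetical names and values, and the toy detector is a keyword heuristic standing in for the actual code language model with graph neural networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    """One SQL prediction from any parser (hypothetical container)."""
    sql: str
    parser_score: float  # the parser's own confidence, unused by the detector


# Assumed detector interface: (question, sql) -> estimated probability
# that the SQL is correct. It never inspects parser internals, which is
# what makes it parser-agnostic in this sketch.
Detector = Callable[[str, str], float]


def rerank(question: str, candidates: List[Candidate],
           detector: Detector) -> List[Tuple[Candidate, float]]:
    """Order candidates by detector score, highest first."""
    scored = [(c, detector(question, c.sql)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def should_trigger_interaction(p_correct: float,
                               threshold: float = 0.5) -> bool:
    """Ask the user for help when the detector doubts the prediction.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return p_correct < threshold


def toy_detector(question: str, sql: str) -> float:
    """Stand-in heuristic for demonstration only, not the paper's model."""
    wants_count = "how many" in question.lower()
    has_count = "count(" in sql.lower()
    return 0.9 if wants_count == has_count else 0.2


question = "How many singers are there?"
beam = [
    Candidate("SELECT name FROM singer", parser_score=-0.1),
    Candidate("SELECT COUNT(*) FROM singer", parser_score=-0.5),
]
ranked = rerank(question, beam, toy_detector)
best, best_p = ranked[0]
ask_user = should_trigger_interaction(best_p)
```

In this toy run the parser's top-scored candidate is demoted because the detector considers it likely wrong, and no interaction is triggered for the reranked top candidate. A real detector would replace `toy_detector` with a trained model that returns a calibrated score.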