MURRE: Multi-Hop Table Retrieval with Removal for Open-Domain Text-to-SQL

Xuanliang Zhang, Dingzirui Wang, Longxu Dou, Qingfu Zhu, Wanxiang Che
2024-09-18
Abstract: The open-domain text-to-SQL task aims to retrieve question-relevant tables from massive databases and generate SQL. However, the performance of current methods is constrained by single-hop retrieval, and existing multi-hop retrieval for open-domain question answering is not directly applicable because it tends to retrieve tables that are similar to the retrieved ones but irrelevant to the question. The cause is that questions in text-to-SQL usually contain all the required information, whereas previous multi-hop retrieval supplements the question with retrieved documents. Therefore, we propose multi-hop table retrieval with removal (MURRE), which removes previously retrieved information from the question to guide the retriever towards unretrieved relevant tables. Our experiments on two open-domain text-to-SQL datasets demonstrate an average improvement of 5.7% over the previous state-of-the-art results.
Computation and Language
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses multi-hop table retrieval in the open-domain text-to-SQL task. Existing methods rely mainly on single-hop retrieval, which limits their performance. Multi-hop retrieval methods from open-domain question answering cannot be applied directly to this task, because they tend to retrieve tables that are similar to the already retrieved tables but irrelevant to the question. The mismatch arises because text-to-SQL questions usually contain all the necessary information, whereas previous multi-hop retrieval methods expand the question with the retrieved documents, so the expanded question can pull in tables that are irrelevant to the original question.

To solve these problems, the authors propose **MURRE** (Multi-Hop Table Retrieval with Removal), a multi-hop table retrieval method based on removal. MURRE improves retrieval accuracy by removing the already retrieved information from the question, which guides the retriever towards the relevant tables that have not yet been retrieved.

### Main contributions

1. **Analyze the inapplicability of multi-hop retrieval from open-domain question answering**:
   - Discuss why the multi-hop retrieval methods used in open-domain question answering cannot be directly applied to the open-domain text-to-SQL task.
2. **Propose the MURRE method**:
   - Remove the retrieved information at each hop, so that the retrieved tables are relevant to the user question rather than merely similar to the tables already retrieved.
3. **Experimental verification**:
   - Experiments on two datasets (SpiderUnion and BirdUnion) show that MURRE improves performance by an average of 5.7% over the previous state-of-the-art methods.

### Method overview

The main steps of MURRE are:

1. **Retrieval**:
   - Retrieve the tables relevant to the question by calculating the relevance probability between each table and the question.
2. **Removal**:
   - Use a large language model (LLM) to remove the retrieved information from the question and represent the unretrieved information in tabular form to guide the subsequent retrieval.
3. **Scoring**:
   - Score the tables according to their relevance probability with respect to the question, and select the most relevant tables as the input for generating SQL.

### Experimental results

- **Performance improvement**:
  - MURRE significantly improves the recall rate and the full recall rate on the SpiderUnion and BirdUnion datasets, with an average improvement of 5.7%.
- **Effectiveness of multi-hop retrieval**:
  - Experiments show that multi-hop retrieval can effectively retrieve tables relevant to the question, especially when multiple tables are required.
- **Importance of the removal mechanism**:
  - The removal mechanism significantly reduces the number of retrieved tables that are irrelevant to the question but similar to the already retrieved tables, which improves retrieval accuracy.

### Conclusion

MURRE addresses the challenges of multi-hop table retrieval in the open-domain text-to-SQL task by introducing the removal mechanism, and it significantly improves retrieval performance. The method provides an effective solution for the open-domain text-to-SQL task.
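
The retrieval, removal, and scoring steps described in the method overview can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a bag-of-words overlap in place of the retriever's relevance probability, plain token subtraction in place of the LLM-based removal, and a greedy single path in place of whatever search strategy the paper uses. The names `tokenize`, `relevance`, `remove_retrieved`, and `murre_sketch`, as well as the example schemas, are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it on whitespace into a set of tokens."""
    return set(text.lower().split())


def relevance(question_tokens: Set[str], table_tokens: Set[str]) -> float:
    """Toy stand-in for the relevance probability (assumption):
    the share of the remaining question tokens that the table covers."""
    if not question_tokens:
        return 0.0
    return len(question_tokens & table_tokens) / len(question_tokens)


def remove_retrieved(question_tokens: Set[str], table_tokens: Set[str]) -> Set[str]:
    """Toy stand-in for LLM-based removal (assumption):
    drop every question token that the retrieved table already covers."""
    return question_tokens - table_tokens


def murre_sketch(
    question: str, tables: Dict[str, str], hops: int = 2
) -> List[Tuple[str, float]]:
    """Greedy multi-hop retrieval with removal over toy table schemas."""
    remaining = tokenize(question)
    table_tokens = {name: tokenize(schema) for name, schema in tables.items()}
    scores: Dict[str, float] = {}

    for _ in range(hops):
        # Retrieval: score every table not yet retrieved against what is
        # left of the question.
        candidates = {
            name: relevance(remaining, tokens)
            for name, tokens in table_tokens.items()
            if name not in scores
        }
        if not candidates:
            break
        best = max(candidates, key=candidates.get)
        if candidates[best] == 0.0:
            break  # nothing left in the question matches any table
        scores[best] = candidates[best]
        # Removal: strip the information the retrieved table already covers,
        # so the next hop is not drawn to look-alike tables.
        remaining = remove_retrieved(remaining, table_tokens[best])

    # Scoring (simplified assumption): rank tables by their per-hop score.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_tables = {
        "singer": "singer id name country age",
        "singer_award": "singer id award year",
        "stadium": "stadium id location capacity",
    }
    question = "list singer name country and stadium capacity"
    for table, score in murre_sketch(question, toy_tables):
        print(table, round(score, 3))
```

In this toy setup, `singer_award` shares tokens with the first retrieved table but contributes nothing once the covered tokens are removed from the question, which is the behaviour the removal step is meant to encourage.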