QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback-based Self-Correction

Xiang Huang, Sitao Cheng, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu
2024-06-13
Abstract: Employing Large Language Models (LLMs) for semantic parsing has achieved remarkable success. However, we find existing methods fall short in terms of reliability and efficiency when hallucinations are encountered. In this paper, we address these challenges with a framework called QueryAgent, which solves a question step-by-step and performs step-wise self-correction. We introduce an environmental feedback-based self-correction method called ERASER. Unlike traditional approaches, ERASER leverages rich environmental feedback in the intermediate steps to perform selective and differentiated self-correction only when necessary. Experimental results demonstrate that QueryAgent notably outperforms all previous few-shot methods using only one example on GrailQA and GraphQ by 7.0 and 15.0 F1. Moreover, our approach exhibits superiority in terms of efficiency, including runtime, query overhead, and API invocation costs. By leveraging ERASER, we further improve another baseline (i.e., AgentBench) by approximately 10 points, revealing the strong transferability of our approach.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the reliability and efficiency shortcomings of existing knowledge base question answering (KBQA) systems built on large language models (LLMs). Specifically:

1. **Reliability issues**:
   - Existing methods perform poorly when they encounter hallucinations, that is, when the model generates inaccurate or unreasonable output.
   - These methods lack interpretability and are prone to error propagation: later reasoning steps build on an earlier mistake, which degrades the accuracy of the final result.
2. **Efficiency issues**:
   - Existing methods are costly in running time, number of queries, and API calls. For example, some methods need to execute thousands of SPARQL queries and take several minutes to obtain the final answer.
   - Some methods rely on beam search and self-consistency, which can improve accuracy but also increase running time and the number of queries.

To solve these problems, the authors propose a framework named QueryAgent, which improves reliability and efficiency by constructing the target query step by step and performing step-wise self-correction. In addition, QueryAgent introduces ERASER, a self-correction method based on environmental feedback that detects and corrects errors in intermediate steps, thus avoiding error accumulation.

### Main contributions

- **Improved reliability**: Through step-by-step reasoning and self-correction mechanisms, QueryAgent can more reliably generate correct query results.
- **Improved efficiency**: Compared with existing methods, QueryAgent significantly reduces running time and the number of queries and lowers API call costs.
- **Innovative self-correction method**: ERASER utilizes rich environmental feedback to perform targeted self-correction only when necessary, improving the overall performance of the system.
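The idea of step-wise reasoning with feedback-triggered correction can be illustrated with a small sketch. This is not the authors' implementation: the toy knowledge base, the `Feedback` type, and the functions `execute`, `toy_corrector`, and `solve` are all hypothetical names introduced here, and a hard-coded lookup stands in for the LLM. The sketch only shows the control flow: each step is executed against an environment, and a correction is attempted only when the environment reports an error, with the guidance depending on the kind of error.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Toy knowledge base (hypothetical data): relation -> {subject: objects}.
KB = {
    "directed_by": {"Inception": {"Christopher Nolan"}},
    "born_in": {"Christopher Nolan": {"London"}},
}


@dataclass(frozen=True)
class Feedback:
    ok: bool
    result: frozenset = frozenset()
    error: Optional[str] = None  # short error category
    hint: str = ""               # guidance a real LLM would see in its prompt


def execute(relation: str, entities: frozenset) -> Feedback:
    """Run one reasoning step against the toy KB and report what happened."""
    if relation not in KB:
        return Feedback(False, error="unknown_relation",
                        hint="choose one of: " + ", ".join(sorted(KB)))
    found = set()
    for entity in entities:
        found |= KB[relation].get(entity, set())
    if not found:
        return Feedback(False, error="empty_result",
                        hint="relation exists but matches no current entity")
    return Feedback(True, result=frozenset(found))


def toy_corrector(relation: str, fb: Feedback) -> str:
    """Stand-in for an LLM call: pick a fix based on the error category."""
    if fb.error == "unknown_relation":
        fixes = {"director": "directed_by", "birthplace": "born_in"}
        return fixes.get(relation, relation)
    return relation  # no fix known for other error categories in this toy


def solve(start: Iterable[str], plan: list,
          correct: Callable[[str, Feedback], str], max_retries: int = 1):
    """Execute the plan step by step; correct a step only if it fails."""
    current = frozenset(start)
    calls = 0  # number of environment executions, a rough cost measure
    for relation in plan:
        fb = execute(relation, current)
        calls += 1
        retries = 0
        while not fb.ok and retries < max_retries:
            relation = correct(relation, fb)
            fb = execute(relation, current)
            calls += 1
            retries += 1
        if not fb.ok:
            raise RuntimeError(f"step {relation!r} failed: {fb.error}")
        current = fb.result
    return current, calls


# The first step uses a hallucinated relation name and is repaired in place.
answer, calls = solve({"Inception"}, ["director", "born_in"], toy_corrector)
print(sorted(answer), calls)  # prints ['London'] 3
```

Because the error is caught at the step where it occurs, the second step starts from a correct intermediate result, and a plan with no errors incurs no correction calls at all.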
### Experimental results

The experimental results show that QueryAgent, using only one example, outperforms all previous few-shot methods, improving F1 by 7.0 and 15.0 points on the GrailQA and GraphQ datasets respectively. QueryAgent is also more efficient in terms of running time, query overhead, and API call costs.

In conclusion, this paper tackles the reliability and efficiency challenges of existing KBQA systems by introducing QueryAgent and ERASER, providing a more efficient and reliable solution.