SymBa: Symbolic Backward Chaining for Structured Natural Language Reasoning

Jinu Lee, Wonseok Hwang
2024-10-19
Abstract: To improve the performance and explainability of LLM-based natural language reasoning, structured reasoning can be applied to generate explicitly structured proofs. Among different methods for structured reasoning, we specifically focus on backward chaining, where the proof goal is recursively decomposed to subgoals by searching and applying rules. We argue that current LLM-based backward chaining systems (e.g. Least-to-most prompting and LAMBADA) are incomplete, as they omit crucial algorithmic components identified from the classic backward chaining algorithm (SLD Resolution) in computational logic. To this end, we propose a novel backward chaining system, SymBa (Symbolic Backward Chaining), which integrates a symbolic solver and an LLM. In SymBa, the solver controls the proof process, and the LLM is only called when the solver requires new information to complete the proof. Empowered by completeness, SymBa achieves a significant improvement in deductive, relational, and arithmetic reasoning benchmarks compared to the baselines.
Computation and Language
What problem does this paper attempt to address?
The paper aims to improve the performance and interpretability of natural language reasoning based on large language models (LLMs). Specifically, it argues that current LLM-based backward chaining systems (such as Least-to-most prompting and LAMBADA) are incomplete: they omit key algorithmic components of the classical backward chaining algorithm from computational logic, SLD resolution. As a result, these systems can follow inaccurate reasoning paths and struggle with relational and arithmetic reasoning on complex problems. To address this, the authors propose SymBa (Symbolic Backward Chaining), a backward chaining system that combines a symbolic solver with an LLM. In SymBa, the symbolic solver controls the proof process, and the LLM is invoked only when the solver needs new information to complete the proof. Through this solver-LLM integration, SymBa benefits from the completeness of SLD resolution while still using the natural language reasoning ability of the LLM.

### Main Contributions

1. **Analysis of Incompleteness**: The authors compare the algorithmic components of existing LLM-based backward chaining systems against SLD resolution and show which components those systems lack.
2. **Proposing SymBa**: The authors propose SymBa, a backward chaining system in which a symbolic solver controls the proof and invokes the LLM only when needed.
3. **Empirical Results**: Experiments on multiple benchmarks show that SymBa outperforms the baseline methods in answer accuracy, proof accuracy, and efficiency.

### Method

#### Baseline Methods

- **Least-to-most prompting**: A two-stage task decomposition method with a decomposition phase and a solution phase. In the decomposition phase, the problem is split into sub-problems, which are ordered by complexity. In the solution phase, each sub-problem is answered using the original problem and the previous sub-problem/answer pairs. This method does not support backtracking, so an inaccurate decomposition cannot be corrected afterwards.
- **LAMBADA**: A modular backward chaining method that operates on pure natural language. Given a goal, it checks all facts and rules to find the ones that apply. If a relevant fact is found, the recursion stops; if a matching rule is found, the goal is decomposed into subgoals. Although LAMBADA implements backtracking, it does not handle binding propagation correctly, especially across subgoals.

#### Proposed Method

- **Symbolic Backward Chaining**: SymBa directly integrates an SLD resolution solver with an LLM. Initially, the solver cannot prove the given goal because its symbolic database is empty. To make progress, the solver invokes the LLM to check whether the natural language description contains rules or facts that can be unified with the failed goal. When the LLM generates a statement that unifies with the goal, the solver re-attempts to prove the failed goal. This process continues until the top-level goal is proven or all possible reasoning paths have failed.

### Experimental Setup

- **Benchmarks**: Multiple benchmarks covering deductive, relational, and arithmetic reasoning.
- **Solver**: A Python solver based on SLD resolution, extended with negation handling and arithmetic operations.
- **Single-step Statement Generation**: Three state-of-the-art LLMs (GPT-4 Turbo, Claude 3 Sonnet, LLaMa 3 70B Instruct) are used to implement both the baseline methods and SymBa.

### Results

- **Answer Accuracy**: SymBa shows the strongest performance across multiple reasoning types (deductive, relational, arithmetic) and across different LLMs.
- **Proof Accuracy**: The proofs generated by SymBa are the most accurate, while the proof accuracy of Least-to-most prompting and LAMBADA declines significantly on specific tasks.
- **Efficiency**: SymBa is significantly better than LAMBADA in token usage, API cost, and execution time, and is more efficient than Least-to-most prompting on the ProofWriter benchmark.

### Analysis

- **Solver Ablation Study**: An ablation that disables backtracking and binding propagation in the solver is used to verify the contribution of these two components.
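
The solver-LLM loop described under Proposed Method can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term encoding, the function names `prove`, `prove_all`, and `answer`, and the `fake_llm` stub (which stands in for a real LLM call by looking statements up in a hand-written list) are all assumptions made for illustration. Negation and arithmetic are left out, and the LLM is asked only when no proof of a goal is found in the current database.

```python
from itertools import count

# Assumed encoding: an atom is a tuple (predicate, arg, ...); strings starting
# with an uppercase letter are variables; a rule is (head, [body atoms]).
_fresh = count()

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow variable bindings in substitution s until a non-bound term."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Return s extended so that atoms a and b match, or None if impossible."""
    if len(a) != len(b):
        return None
    s = dict(s)
    for x, y in zip(a, b):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

def rename(rule):
    """Give a rule fresh variable names so separate uses do not clash."""
    head, body = rule
    n = next(_fresh)
    r = lambda atom: tuple(f"{t}_{n}" if is_var(t) else t for t in atom)
    return r(head), [r(a) for a in body]

def prove(goal, s, db, ask_llm, asked):
    """Yield every substitution proving one goal; ask the LLM once on failure."""
    goal = tuple(walk(t, s) for t in goal)
    key = tuple("_" if is_var(t) else t for t in goal)
    found = False
    while True:
        for rule in list(db):            # each alternative is a backtrack point
            head, body = rename(rule)
            s2 = unify(goal, head, s)
            if s2 is not None:
                for s3 in prove_all(body, s2, db, ask_llm, asked):
                    found = True
                    yield s3
        if found or key in asked:
            return
        asked.add(key)                   # the goal failed: request new statements
        new = [r for r in ask_llm(goal) if r not in db]
        if not new:
            return
        db.extend(new)                   # then retry the same goal

def prove_all(goals, s, db, ask_llm, asked):
    """Prove a conjunction, passing bindings from each subgoal to the next."""
    if not goals:
        yield s
        return
    for s1 in prove(goals[0], s, db, ask_llm, asked):
        yield from prove_all(goals[1:], s1, db, ask_llm, asked)

def answer(query, ask_llm):
    """Start from an empty database and return the first proven instance."""
    for s in prove(query, {}, [], ask_llm, set()):
        return tuple(walk(t, s) for t in query)
    return None

# Hypothetical stand-in for the LLM: statements it could "read" from a context.
CONTEXT = [
    (("parent", "alice", "dave"), []),
    (("parent", "alice", "bob"), []),
    (("parent", "bob", "carol"), []),
    (("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")]),
]
calls = []

def fake_llm(goal):
    calls.append(goal)
    return [r for r in CONTEXT if unify(goal, rename(r)[0], {}) is not None]

print(answer(("grandparent", "alice", "Who"), fake_llm))
# prints ('grandparent', 'alice', 'carol')
```

In this toy run the solver first binds `Y` to `dave`, finds no parent facts for `dave`, and backtracks to `bob`. The binding of `Y` made by the first subgoal is what the second subgoal receives, which is the binding propagation that the summary says LAMBADA handles incorrectly.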