Deductive Verification of Chain-of-Thought Reasoning

Zhan Ling, Yunhao Fang, Xuanlin Li, Zhiao Huang, Mingu Lee, Roland Memisevic, Hao Su

2023-10-04

Abstract: Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks. While CoT allows models to produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, thereby limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful and meticulous deductive logical reasoning processes to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and also ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even with advanced models like ChatGPT. In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each only receiving their necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps where subsequent steps are more rigorously grounded on prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustfulness of generated reasoning steps. Along this process, we also improve the answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot.

Computation and Language, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

This paper addresses a weakness of Chain-of-Thought (CoT) prompting in large language models (LLMs). CoT lets a model produce more comprehensive reasoning, but its emphasis on intermediate steps can inadvertently introduce hallucinations and accumulated errors, which limits the model's ability to solve complex reasoning tasks. The paper asks how to make language models perform explicit, rigorous deductive reasoning and how to ensure the trustworthiness of that reasoning through self-verification. Directly verifying the validity of an entire deductive reasoning process is very challenging, even for advanced models such as ChatGPT. The authors therefore decompose reasoning verification into a series of step-by-step subprocesses, each of which receives only the context and premises it needs. To support this, they propose Natural Program, a deductive reasoning format based on natural language. With it, the model generates precise reasoning steps in which later steps are more rigorously grounded in earlier ones, and it can verify its own reasoning one step at a time. Integrating this verification into each deductive reasoning stage significantly improves the rigor and trustworthiness of the generated reasoning steps, and it also improves answer correctness on complex reasoning tasks.
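The decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the actual Natural Program format or prompts. The names `Step`, `minimal_context`, `verify_chain`, and `check_step`, as well as the `#n` labelling of statements, are assumptions made for illustration. In practice `check_step` would wrap a call to a language model; here it is a stub that records what it was shown and accepts every step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    index: int        # label of this step (hypothetical "#n" numbering)
    cites: List[int]  # labels of the premises or earlier steps it relies on
    text: str         # the natural-language deduction


def minimal_context(step: Step, statements: Dict[int, str]) -> List[str]:
    """Collect only the statements that this step explicitly cites."""
    return [f"#{i}. {statements[i]}" for i in step.cites]


def verify_chain(
    premises: Dict[int, str],
    steps: List[Step],
    check_step: Callable[[List[str], str], bool],
) -> List[bool]:
    """Verify each step in isolation, showing the checker only its cited context."""
    statements = dict(premises)
    verdicts = []
    for step in steps:
        context = minimal_context(step, statements)
        verdicts.append(check_step(context, step.text))
        # Later steps may cite this one, so register it after checking.
        statements[step.index] = step.text
    return verdicts


# Toy usage with a stub checker (an assumption standing in for an LLM call).
seen: List[List[str]] = []


def record_and_accept(context: List[str], text: str) -> bool:
    seen.append(context)  # remember exactly what the checker was shown
    return True


premises = {
    1: "Alice has 3 apples.",
    2: "Bob has 5 apples.",
    3: "Carol has 2 pears.",
}
steps = [
    Step(4, [1, 2], "Alice and Bob have 3 + 5 = 8 apples in total."),
    Step(5, [4], "Half of their apples is 8 / 2 = 4 apples."),
]

verdicts = verify_chain(premises, steps, record_and_accept)
chain_ok = all(verdicts)  # the chain is accepted only if every step passes
```

In this toy run the checker never sees premise 3, because no step cites it. That is the point of passing each subprocess only its necessary context: irrelevant statements cannot distract the verifier.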

Concise and Organized Perception Facilitates Large Language Models for Deductive Reasoning

Large Language Models Are Better Reasoners with Self-Verification

Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification

General Purpose Verification for Chain of Thought Prompting

Large Language Models are reasoners with Self-Verification

GraphReason: Enhancing Reasoning Capabilities of Large Language Models through A Graph-Based Verification Approach

ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models

CoT Rerailer: Enhancing the Reliability of Large Language Models in Complex Reasoning Tasks through Error Detection and Correction

ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting

Multimodal Chain-of-Thought Reasoning in Language Models

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation

Are LLMs Rigorous Logical Reasoner? Empowering Natural Language Proof Generation with Contrastive Stepwise Decoding

The Impact of Reasoning Step Length on Large Language Models

Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework

RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought

Towards understanding chain-of-thought prompting: An empirical study of what matters

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks