Deductive Verification of Chain-of-Thought Reasoning

Zhan Ling, Yunhao Fang, Xuanlin Li, Zhiao Huang, Mingu Lee, Roland Memisevic, Hao Su
2023-10-04
Abstract: Large Language Models (LLMs) significantly benefit from Chain-of-Thought (CoT) prompting in performing various reasoning tasks. While CoT allows models to produce more comprehensive reasoning processes, its emphasis on intermediate reasoning steps can inadvertently introduce hallucinations and accumulated errors, thereby limiting models' ability to solve complex reasoning tasks. Inspired by how humans engage in careful and meticulous deductive logical reasoning processes to solve tasks, we seek to enable language models to perform explicit and rigorous deductive reasoning, and also ensure the trustworthiness of their reasoning process through self-verification. However, directly verifying the validity of an entire deductive reasoning process is challenging, even with advanced models like ChatGPT. In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each only receiving their necessary context and premises. To facilitate this procedure, we propose Natural Program, a natural language-based deductive reasoning format. Our approach enables models to generate precise reasoning steps where subsequent steps are more rigorously grounded on prior steps. It also empowers language models to carry out reasoning self-verification in a step-by-step manner. By integrating this verification process into each deductive reasoning stage, we significantly enhance the rigor and trustfulness of generated reasoning steps. Along this process, we also improve the answer correctness on complex reasoning tasks. Code will be released at https://github.com/lz1oceani/verify_cot.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses a weakness of Chain-of-Thought (CoT) prompting in large language models (LLMs). CoT lets a model produce more comprehensive reasoning, but its emphasis on intermediate steps can introduce hallucinations and accumulated errors, which limits the model's ability to solve complex reasoning tasks. The paper asks how to make language models reason in an explicit, rigorous, deductive way and how to make that reasoning trustworthy through self-verification. Verifying the validity of an entire deductive reasoning chain at once is difficult, even for advanced models such as ChatGPT. The authors therefore decompose verification into a series of step-by-step subprocesses, each of which receives only the context and premises it needs. To support this, they propose "Natural Program", a deductive reasoning format based on natural language. With this format, the model generates precise reasoning steps in which each step is more rigorously grounded in earlier ones, and it can verify its own reasoning one step at a time. Integrating this verification into each stage of deductive reasoning significantly improves the rigor and trustworthiness of the generated steps, and it also improves answer correctness on complex reasoning tasks.
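The decomposition described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The line syntax `#<id>. (by #<id> ...) <statement>`, the helper names `parse_line` and `verify_chain`, and the `verifier` callback (a stand-in for a call to a language model) are all assumptions made for illustration. The point it shows is that each step is checked in isolation, and the verifier sees only the statements that step explicitly cites.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical line syntax (an assumption for this sketch):
#   "#<id>. <statement>"                    for a given premise
#   "#<id>. (by #<id> #<id> ...) <statement>" for a derived step
LINE_RE = re.compile(r"^#(\d+)\.\s*(?:\(by((?:\s*#\d+)+)\s*\))?\s*(.*)$")

# A verifier takes (cited statements, candidate statement) and returns
# True if the candidate follows from the cited statements.
Verifier = Callable[[List[str], str], bool]


def parse_line(line: str) -> Tuple[int, List[int], str]:
    """Split one line into (step id, cited step ids, statement text)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    step_id = int(match.group(1))
    cited = [int(n) for n in re.findall(r"#(\d+)", match.group(2) or "")]
    return step_id, cited, match.group(3)


def verify_chain(lines: List[str], verifier: Verifier) -> List[Tuple[int, bool]]:
    """Check each derived step on its own, using only the steps it cites.

    Returns one (step id, passed) pair per derived step. Lines without
    citations are treated as given premises and are not checked.
    """
    statements: Dict[int, str] = {}
    results: List[Tuple[int, bool]] = []
    for line in lines:
        step_id, cited, text = parse_line(line)
        if cited:
            if any(c not in statements for c in cited):
                # The step cites something that has not been stated yet.
                results.append((step_id, False))
            else:
                # Minimal context: only the cited statements, nothing else.
                context = [statements[c] for c in cited]
                results.append((step_id, verifier(context, text)))
        statements[step_id] = text
    return results


if __name__ == "__main__":
    chain = [
        "#1. Alice has 3 apples.",
        "#2. Bob gives Alice 2 apples.",
        "#3. (by #1 #2) Alice has 5 apples in total.",
    ]

    def toy_verifier(context: List[str], statement: str) -> bool:
        # Placeholder: a real verifier would prompt a language model here.
        print("context:", context, "| step:", statement)
        return True

    print(verify_chain(chain, toy_verifier))
```

Passing the verifier as a callback keeps the decomposition logic separate from whichever model performs the check, so the same loop could be reused with a different verification prompt or with several sampled verifications per step.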