Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models

Sijia Chen, Baochun Li, Di Niu
2024-02-17
Abstract: The reasoning performance of Large Language Models (LLMs) on a wide range of problems critically relies on chain-of-thought prompting, which involves providing a few chain of thought demonstrations as exemplars in prompts. Recent work, e.g., Tree of Thoughts, has pointed out the importance of exploration and self-evaluation in reasoning step selection for complex problem solving. In this paper, we present Boosting of Thoughts (BoT), an automated prompting framework for problem solving with LLMs by iteratively exploring and self-evaluating many trees of thoughts in order to acquire an ensemble of trial-and-error reasoning experiences, which will serve as a new form of prompting to solve the complex problem. Starting from a simple prompt without requiring examples, BoT iteratively explores and evaluates a large collection of reasoning steps, and more importantly, uses error analysis obtained from the LLM on them to explicitly revise prompting, which in turn enhances reasoning step generation, until a final answer is attained. Our experiments with GPT-4 and Llama2 across extensive complex mathematical problems demonstrate that BoT consistently achieves higher or comparable problem-solving rates than other advanced prompting approaches.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper asks how to automatically improve the reasoning of large language models (LLMs) on complex problem-solving tasks without relying on human-annotated examples. It proposes a framework named "Boosting of Thoughts (BoT)", which iteratively explores and self-evaluates many trees of thoughts to build up an ensemble of trial-and-error reasoning experiences. These experiences then serve as a new form of prompt that helps the LLM solve the problem. BoT starts from a simple prompt with no examples. In each iteration it explores and evaluates a large number of reasoning steps, has the LLM analyze the errors in those steps, and uses that analysis to explicitly revise the prompt. The revised prompt improves the next round of reasoning-step generation, and the cycle repeats until a final answer is obtained. The core idea is learning from mistakes, much as humans do when they study their errors carefully to gain experience and gradually improve. In the paper's experiments with GPT-4 and Llama2 on complex mathematical problems, BoT achieves problem-solving rates higher than or comparable to those of other advanced prompting methods.
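The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Experience`, `build_prompt`, `boosting_of_thoughts`, `explore`, `analyze`, and `is_final` are all hypothetical, the exploration of many thought trees is collapsed into a single `explore` call that returns one reasoning chain, and the explicit stopping check and round limit are assumptions made so the example terminates. The toy functions stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Experience:
    """One trial: the reasoning chain that was tried and the error analysis of it."""
    chain: List[str]
    feedback: str


def build_prompt(problem: str, experiences: List[Experience]) -> str:
    """Start from a simple, example-free prompt and append accumulated experience."""
    lines = [f"Problem: {problem}"]
    for i, exp in enumerate(experiences, start=1):
        lines.append(f"Attempt {i}: {' -> '.join(exp.chain)}")
        lines.append(f"Analysis {i}: {exp.feedback}")
    return "\n".join(lines)


def boosting_of_thoughts(
    problem: str,
    explore: Callable[[str], List[str]],     # hypothetical: prompt -> reasoning chain
    analyze: Callable[[List[str]], str],     # hypothetical: chain -> error analysis
    is_final: Callable[[List[str]], bool],   # hypothetical stopping check (assumption)
    max_rounds: int = 5,                     # assumed safety limit
) -> Tuple[List[str], List[Experience]]:
    experiences: List[Experience] = []
    chain: List[str] = []
    for _ in range(max_rounds):
        prompt = build_prompt(problem, experiences)  # revised prompt for this round
        chain = explore(prompt)                      # generate reasoning steps
        if is_final(chain):
            break
        # Record the failed trial and its analysis so the next prompt includes it.
        experiences.append(Experience(chain, analyze(chain)))
    return chain, experiences


# Toy stand-ins for the LLM, used only to make the sketch runnable.
def toy_explore(prompt: str) -> List[str]:
    # Pretend model: corrects its guess once the prompt contains an error analysis.
    return ["x = 4"] if "too small" in prompt else ["x = 3"]


def toy_analyze(chain: List[str]) -> str:
    return f"{chain[-1]} is too small"


def toy_is_final(chain: List[str]) -> bool:
    return chain[-1] == "x = 4"


chain, experiences = boosting_of_thoughts(
    "find x with x + 1 = 5", toy_explore, toy_analyze, toy_is_final
)
```

In this toy run the first round produces a wrong chain, its analysis is appended to the prompt, and the second round uses that accumulated experience to reach the answer. In a real setting `explore` and `analyze` would each wrap one or more LLM calls.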