ReAct: Synergizing Reasoning and Acting in Language Models

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
DOI: https://doi.org/10.48550/arXiv.2210.03629
2023-03-10
Abstract: While large language models (LLMs) have demonstrated impressive capabilities across tasks in language understanding and interactive decision making, their abilities for reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) have primarily been studied as separate topics. In this paper, we explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information. We apply our approach, named ReAct, to a diverse set of language and decision making tasks and demonstrate its effectiveness over state-of-the-art baselines, as well as improved human interpretability and trustworthiness over methods without reasoning or acting components. Concretely, on question answering (HotpotQA) and fact verification (Fever), ReAct overcomes issues of hallucination and error propagation prevalent in chain-of-thought reasoning by interacting with a simple Wikipedia API, and generates human-like task-solving trajectories that are more interpretable than baselines without reasoning traces. On two interactive decision making benchmarks (ALFWorld and WebShop), ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples. Project site with code: https://react-lm.github.io
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to combine reasoning and acting in large language models (LLMs) to improve their performance on language understanding and interactive decision-making tasks. It proposes a method named ReAct, in which the model generates reasoning traces and task-specific actions in an interleaved manner so that each supports the other. The method improves task performance and also makes the model's behavior more interpretable and trustworthy.

### Main research questions:

1. **How to effectively combine reasoning and acting in large language models**: Prior work has mostly studied reasoning (e.g. chain-of-thought prompting) and acting (e.g. action plan generation) as separate topics. This paper explores how to perform both within a single framework so that the two reinforce each other.
2. **How to strengthen reasoning with information from the external environment**: The model interacts with external knowledge sources (such as a Wikipedia API) to obtain additional information, which helps it overcome the hallucination and error propagation that are prevalent in chain-of-thought reasoning.
3. **How to verify the effectiveness of ReAct across tasks**: The paper runs experiments on several benchmarks, covering question answering (HotpotQA), fact verification (Fever), text-based games (ALFWorld), and web navigation (WebShop).

### Specific objectives:

- **Improve accuracy on question answering and fact verification**: By interacting with the Wikipedia API, ReAct grounds its answers and verdicts in retrieved information, reducing hallucination and error propagation.
- **Increase the success rate on interactive decision-making tasks**: In interactive environments such as ALFWorld and WebShop, ReAct raises the task success rate by generating reasoning traces alongside its actions.
- **Enhance interpretability and trustworthiness**: The task-solving trajectories that ReAct generates resemble human problem solving, which makes the model's behavior easier to understand and trust.

### Method innovation points:

- **Combination of reasoning and acting**: ReAct alternates between generating reasoning traces and generating actions. Reasoning traces help the model induce, track, and update action plans and handle exceptions, while actions let it gather new information.
- **Use of external knowledge sources**: By interacting with external sources (such as Wikipedia), ReAct can obtain information beyond what the model already knows, improving the accuracy and reliability of its reasoning.
- **Generalization across tasks**: ReAct is applied to both knowledge-intensive language tasks and interactive decision-making tasks, which indicates that it applies broadly.

### Experimental results:

- **Performance on HotpotQA and Fever**: ReAct reduces the hallucination and error propagation seen in chain-of-thought reasoning, and performs best when the model's internal knowledge is combined with externally retrieved knowledge.
- **Performance on ALFWorld and WebShop**: ReAct outperforms imitation learning and reinforcement learning methods by absolute success rates of 34% and 10%, respectively, while being prompted with only one or two in-context examples.

In conclusion, ReAct tightly couples reasoning and acting in large language models, which improves performance across multiple tasks and makes the model's behavior more interpretable and trustworthy.
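
The alternation between reasoning and acting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Search[...]` and `Finish[...]` action strings, the `TOY_KB` dictionary standing in for a Wikipedia API, and the `scripted_llm` stub standing in for a prompted LLM are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stand-in for an external knowledge source such as a Wikipedia API.
TOY_KB: Dict[str, str] = {
    "ReAct": "ReAct is a method that interleaves reasoning traces and actions.",
}

# Assumed action syntax for this sketch: Name[argument], e.g. Search[ReAct].
ACTION_RE = re.compile(r"^(\w+)\[(.*)\]$")


def run_tool(action: str) -> str:
    """Execute one action string against the toy knowledge base."""
    match = ACTION_RE.match(action.strip())
    if match is None:
        return "Invalid action format."
    name, arg = match.groups()
    if name == "Search":
        return TOY_KB.get(arg, f"No entry found for {arg!r}.")
    return f"Unknown action {name!r}."


def react_loop(
    question: str,
    llm: Callable[[str], Tuple[str, str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], str]:
    """Alternate thought -> action -> observation until Finish or the step limit."""
    transcript = f"Question: {question}\n"
    for step in range(1, max_steps + 1):
        # The model sees the whole transcript and returns a thought and an action.
        thought, action = llm(transcript)
        transcript += f"Thought {step}: {thought}\nAction {step}: {action}\n"
        match = ACTION_RE.match(action.strip())
        if match and match.group(1) == "Finish":
            return match.group(2), transcript
        # Any other action is sent to the environment; its result is fed back.
        observation = run_tool(action)
        transcript += f"Observation {step}: {observation}\n"
    return None, transcript


def scripted_llm(transcript: str) -> Tuple[str, str]:
    """Hypothetical stub; a real system would prompt an LLM with the transcript."""
    if "Observation 1:" not in transcript:
        return ("I should look up ReAct first.", "Search[ReAct]")
    return ("The observation answers the question.",
            "Finish[reasoning traces and actions]")


answer, trace = react_loop("What does ReAct interleave?", scripted_llm)
print(trace)
print("Answer:", answer)
```

Feeding each observation back into the transcript is what lets a later thought depend on retrieved information instead of only on what the model already knows. The step limit is a design choice of this sketch that keeps the loop from running indefinitely if the model never emits `Finish[...]`.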