Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang
2024-08-15
Abstract: We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks. Specifically, we build an autonomous agent to instruct the reasoning process of large language models. We show this approach further unleashes the zero-shot reasoning abilities of large language models to more tasks. We study the performance of our method on a wide set of datasets spanning generation, classification, and reasoning. We show that our method generalizes to most tasks and obtains state-of-the-art zero-shot performance on 20 of the 29 datasets that we evaluate. For instance, our method boosts the performance of state-of-the-art large language models by a large margin, including Vicuna-13b (13.3%), Llama-2-70b-chat (23.2%), and GPT-3.5 Turbo (17.0%). Compared to zero-shot chain of thought, our improvement in reasoning is striking, with an average increase of 10.5%. With our method, Llama-2-70b-chat outperforms zero-shot GPT-3.5 Turbo by 10.2%.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper aims to improve the zero-shot reasoning ability of large language models (LLMs) on general language understanding tasks. Specifically, the authors build an autonomous agent that instructs the reasoning process of LLMs, improving their performance across many tasks without task-specific examples. The agent generates a unique set of instructions for each task. These instructions are generated only once per task and then guide the reasoning of different LLMs on every instance of that task.

### Background and Objectives of the Paper
- **Background**: Although LLMs have made significant progress on many language understanding tasks, their zero-shot reasoning ability on general tasks still needs improvement.
- **Objective**: Improve the zero-shot reasoning ability of LLMs on general language understanding tasks by building an autonomous agent that generates task-specific instructions.

### Method Overview
- **Autonomous Agent**: The agent is built on a larger language model and generates a unique set of instructions for each task. These instructions guide the reasoning of smaller language models as they process task instances.
- **Instruction Generation**: The agent produces high-quality, task-specific instructions by accessing relevant knowledge on the web. The instructions are generated once per task and then applied to all instances of that task.
- **Reasoning Process**: Guided by the generated instructions, an LLM can decompose a task more effectively and reach the correct answer through a series of intermediate steps.
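
The generate-once, reuse-everywhere flow described in the method overview can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `InstructionCache`, `build_prompt`, `run_task`, the prompt wording, and the stub models are all hypothetical names introduced here, and the agent's web access is not modeled. Any callable that maps a prompt string to a completion string can stand in for the agent or the reasoning model.

```python
from typing import Callable, Dict, List

# Assumption: a "model" is any function from a prompt string to a completion string.
LLM = Callable[[str], str]


class InstructionCache:
    """Asks the agent for instructions once per task and reuses them afterwards."""

    def __init__(self, agent: LLM) -> None:
        self._agent = agent
        self._cache: Dict[str, str] = {}
        self.agent_calls = 0  # counts how often the (expensive) agent is invoked

    def get(self, task_name: str, task_description: str) -> str:
        if task_name not in self._cache:
            self.agent_calls += 1
            # Hypothetical request wording; the paper's actual agent prompt is not shown here.
            request = (
                f"Write step-by-step instructions for the task '{task_name}': "
                f"{task_description}"
            )
            self._cache[task_name] = self._agent(request)
        return self._cache[task_name]


def build_prompt(instructions: str, instance: str) -> str:
    """Prepends the task-level instructions to a single task instance."""
    return (
        f"Instructions:\n{instructions}\n\n"
        f"Input:\n{instance}\n\n"
        "Answer step by step:"
    )


def run_task(
    cache: InstructionCache,
    reasoner: LLM,
    task_name: str,
    task_description: str,
    instances: List[str],
) -> List[str]:
    """Runs one reasoning model over all instances using the shared instructions."""
    instructions = cache.get(task_name, task_description)
    return [reasoner(build_prompt(instructions, x)) for x in instances]


# Stand-in models so the sketch runs without any external service.
def stub_agent(prompt: str) -> str:
    return "1. Read the input. 2. Reason through it. 3. State the answer."


def stub_reasoner(prompt: str) -> str:
    return f"[stub answer, prompt length {len(prompt)}]"


if __name__ == "__main__":
    cache = InstructionCache(stub_agent)
    answers = run_task(
        cache,
        stub_reasoner,
        "sentiment",
        "Classify the sentiment of a movie review.",
        ["Loved it.", "Too long and dull."],
    )
    print(answers)
    print("agent calls:", cache.agent_calls)
```

Keeping the cache separate from the reasoning model reflects the point that the same instructions can be handed to different models: calling `run_task` again with another `reasoner` and the same `task_name` reuses the stored instructions without invoking the agent a second time.
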
### Experimental Results
- **Dataset**: The method was evaluated on 29 benchmark datasets covering generation, classification, and reasoning tasks.
- **Performance Improvement**: Zero-shot AgentInstruct achieved significant improvements on multiple datasets, especially on reasoning tasks, where it improves on zero-shot chain of thought (CoT) by 10.5% on average.
- **Comparison Method**: Compared with standard zero-shot and zero-shot CoT prompting, zero-shot AgentInstruct performs well on most datasets, especially on reasoning tasks.

### Main Contributions
- **Improve Zero-shot Reasoning Ability**: By generating task-specific instructions, zero-shot AgentInstruct significantly improves the zero-shot reasoning ability of LLMs on multiple tasks.
- **Cost-effectiveness**: The method improves performance while being more cost-effective than directly using the more powerful agent model.
- **Generalization Ability**: The generated instructions transfer well between different LLMs, indicating that the method is widely applicable.

### Conclusion
The paper improves the zero-shot reasoning ability of LLMs on general language understanding tasks by building an autonomous agent that generates task-specific instructions. The method achieves significant performance improvements on multiple datasets and demonstrates cost-effectiveness and generalization ability.