Abstract: We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks. Specifically, we build an autonomous agent to instruct the reasoning process of large language models. We show this approach further unleashes the zero-shot reasoning abilities of large language models to more tasks. We study the performance of our method on a wide set of datasets spanning generation, classification, and reasoning. We show that our method generalizes to most tasks and obtains state-of-the-art zero-shot performance on 20 of the 29 datasets that we evaluate. For instance, our method boosts the performance of state-of-the-art large language models by a large margin, including Vicuna-13b (13.3%), Llama-2-70b-chat (23.2%), and GPT-3.5 Turbo (17.0%). Compared to zero-shot chain of thought, our improvement in reasoning is striking, with an average increase of 10.5%. With our method, Llama-2-70b-chat outperforms zero-shot GPT-3.5 Turbo by 10.2%.

What problem does this paper attempt to address?

This paper aims to improve the zero-shot reasoning ability of large language models (LLMs) on general language understanding tasks. Specifically, the authors build an autonomous agent that guides the reasoning process of LLMs, improving their performance across many tasks without task-specific examples. The agent generates a set of instructions for each task; these instructions are generated only once per task and are then used to guide the reasoning of different LLMs on every instance of that task.

### Background and Objectives of the Paper

- **Background**: Although LLMs have made significant progress on many language understanding tasks, their zero-shot reasoning ability on general tasks still needs improvement.
- **Objective**: Improve the zero-shot reasoning ability of LLMs on general language understanding tasks by building an autonomous agent that generates task-specific instructions.

### Method Overview

- **Autonomous Agent**: The agent is built on a larger language model and generates a set of instructions for each task. These instructions guide the reasoning process of smaller language models as they process task instances.
- **Instruction Generation**: The agent produces high-quality task-specific instructions by accessing relevant knowledge on the web. The instructions are generated only once per task and are then applied to all instances of that task.
- **Reasoning Process**: Using the generated instructions, an LLM can decompose a task more effectively and arrive at the correct answer through a series of intermediate steps.

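
The generate-once, reuse-everywhere pattern described in the method overview can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `LLM` callable type, the `InstructionCache` class, the prompt wording, and the two stub models are all hypothetical names introduced here, and the stub agent does not access the web as the paper's agent does.

```python
from typing import Callable, Dict

# Hypothetical interface: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


class InstructionCache:
    """Asks the agent for task instructions once per task, then reuses them."""

    def __init__(self, agent: LLM) -> None:
        self._agent = agent
        self._instructions: Dict[str, str] = {}
        self.agent_calls = 0  # counts how often the (expensive) agent is invoked

    def get(self, task_name: str, task_description: str) -> str:
        # Only call the agent the first time a task is seen.
        if task_name not in self._instructions:
            self.agent_calls += 1
            self._instructions[task_name] = self._agent(
                "Write step-by-step instructions for this task: " + task_description
            )
        return self._instructions[task_name]


def build_prompt(instructions: str, instance: str) -> str:
    """Combine the cached task instructions with a single task instance."""
    return (
        "Instructions:\n" + instructions + "\n\n"
        "Input:\n" + instance + "\n\n"
        "Follow the instructions step by step, then state the final answer."
    )


def solve(cache: InstructionCache, reasoner: LLM, task_name: str,
          task_description: str, instance: str) -> str:
    """Run the reasoning model on one instance, guided by the task instructions."""
    instructions = cache.get(task_name, task_description)
    return reasoner(build_prompt(instructions, instance))


# Stand-ins for real models (assumptions for illustration only).
def stub_agent(prompt: str) -> str:
    return "1. Read the input. 2. Reason step by step. 3. Give the answer."


def stub_reasoner(prompt: str) -> str:
    return "answer derived from a prompt of " + str(len(prompt)) + " characters"


if __name__ == "__main__":
    cache = InstructionCache(stub_agent)
    for question in ["2 + 2 = ?", "3 * 5 = ?", "10 - 7 = ?"]:
        print(solve(cache, stub_reasoner, "arithmetic",
                    "Solve arithmetic problems.", question))
    print("agent calls:", cache.agent_calls)
```

In this sketch the agent is invoked once for the `arithmetic` task regardless of how many instances are solved, which is the property the summary attributes to the method. Swapping `stub_reasoner` for a different model leaves the cached instructions untouched.
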
### Experimental Results

- **Datasets**: The method is evaluated on 29 benchmark datasets covering generation, classification, and reasoning tasks.
- **Performance Improvement**: Zero-shot AgentInstruct achieves significant performance improvements on multiple datasets, especially on reasoning tasks, where the average improvement is 10.5%.
- **Baselines**: Compared with standard zero-shot and zero-shot chain-of-thought (CoT) prompting, zero-shot AgentInstruct performs well on most datasets, especially on reasoning tasks.

### Main Contributions

- **Improved Zero-Shot Reasoning**: By generating task-specific instructions, zero-shot AgentInstruct significantly improves the zero-shot reasoning ability of LLMs on multiple tasks.
- **Cost-Effectiveness**: The method not only improves performance but is also more cost-effective than using a more powerful agent model directly.
- **Generalization**: The generated instructions transfer well between different LLMs, indicating that the method is widely applicable.

### Conclusion

The paper improves the zero-shot reasoning ability of LLMs on general language understanding tasks by building an autonomous agent that generates task-specific instructions. The method achieves significant performance improvements on multiple datasets and also demonstrates cost-effectiveness and generalization in practical applications.

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Large Language Models are Zero-Shot Reasoners

Enhancing Zero-Shot Chain-of-Thought Reasoning in Large Language Models through Logic

Empirical Study of Zero-Shot NER with ChatGPT

Zero-Shot Chain-of-Thought Reasoning Guided by Evolutionary Algorithms in Large Language Models

Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning

VideoAgent: Long-form Video Understanding with Large Language Model as Agent

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models

Large Language Models Still Face Challenges in Multi-Hop Reasoning with External Knowledge

Conceptual and Unbiased Reasoning in Language Models

Multitask Prompted Training Enables Zero-Shot Task Generalization

Large Language Models as Zero-Shot Conversational Recommenders

Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration

Large Language Models Are Also Good Prototypical Commonsense Reasoners

Zero-Shot Question Answering over Financial Documents using Large Language Models

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

Teaching Smaller Language Models To Generalise To Unseen Compositional Questions

GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents

Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning

Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination

Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text