AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

Mengkang Hu, Pu Zhao, Can Xu, Qingfeng Sun, Jianguang Lou, Qingwei Lin, Ping Luo, Saravan Rajmohan, Dongmei Zhang
2024-08-02
Abstract: Large Language Model (LLM) based agents have garnered significant attention and are becoming increasingly popular. Furthermore, planning ability is a crucial component of an LLM-based agent, involving interaction with the environment and executing actions to complete a planning task, which generally entails achieving a desired goal from an initial state. This paper investigates enhancing the planning abilities of LLMs through instruction tuning, referred to as agent training. Recent studies have demonstrated that utilizing expert-level trajectory for instruction-tuning LLMs effectively enhances their planning capabilities. However, existing work primarily focuses on synthesizing trajectories from manually designed planning tasks and environments. The labor-intensive nature of creating these environments and tasks impedes the generation of sufficiently varied and extensive trajectories. To address this limitation, this paper explores the automated synthesis of diverse environments and a gradual range of planning tasks, from easy to difficult. We introduce a framework, AgentGen, that leverages LLMs first to generate environments and subsequently generate planning tasks conditioned on these environments. Specifically, to improve environmental diversity, we propose using an inspiration corpus composed of various domain-specific text segments as the context for synthesizing environments. Moreover, to increase the difficulty diversity of generated planning tasks, we propose a bidirectional evolution method, Bi-Evol, that evolves planning tasks from easier and harder directions to synthesize a task set with a smoother difficulty curve. The evaluation results derived from AgentBoard show that AgentGen greatly improves LLMs' planning ability, e.g., the AgentGen instruction-tuned Llama-3 8B surpasses GPT-3.5 in overall performance. Moreover, in certain tasks, it even outperforms GPT-4.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses a weakness in the planning capabilities of large language model (LLM) agents: the difficulty of producing diverse, large-scale environment and task data for agent training. Existing research relies mainly on manually designed environments and planning tasks, which is time-consuming and makes it hard to obtain sufficiently diverse trajectory data. To solve this problem, the paper proposes an automated framework, AgentGen, which uses LLMs to generate diverse environments and then planning tasks of graded difficulty conditioned on those environments. The main contributions of AgentGen are:

1. **Environment Generation**: Proposes a method that generates diverse environment specifications by using an inspiration corpus of domain-specific text segments as context, covering a wide range of scenarios.
2. **Task Generation**: Introduces a bidirectional evolution method (Bi-Evol) that evolves planning tasks in both the easier and the harder direction, producing a task set with a smooth difficulty curve.
3. **Experimental Validation**: Constructs a PDDL-based dataset containing 592 environments, each with 20 tasks, instruction-tunes a series of LLMs on it, and evaluates them on AgentBoard. Experimental results show that Llama-3 8B instruction-tuned with AgentGen significantly outperforms the original Llama-3 8B on in-domain tasks, surpasses GPT-3.5 in overall performance, and exceeds GPT-4 on certain tasks. It shows similar advantages on out-of-domain tasks.

In summary, the paper improves the planning capabilities of LLM agents through automated environment and task generation, and demonstrates both the effectiveness and the generalization ability of this approach.
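The two-stage pipeline described above (environment generation from an inspiration corpus, then bidirectional task evolution) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the prompt wording, the `Environment` class, and every function name are hypothetical. Chaining each evolution step on the previous result is only one plausible reading of Bi-Evol.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


@dataclass
class Environment:
    inspiration: str  # text segment from the inspiration corpus
    spec: str         # environment specification returned by the LLM (e.g. PDDL)
    tasks: List[str] = field(default_factory=list)


def generate_environment(llm: LLM, inspiration: str) -> Environment:
    """Stage 1: condition environment synthesis on one inspiration segment."""
    prompt = (
        "Using the following text as inspiration, write a planning "
        "environment specification.\n\n" + inspiration
    )
    return Environment(inspiration=inspiration, spec=llm(prompt))


def evolve_task(llm: LLM, env: Environment, task: str, direction: str) -> str:
    """Ask the LLM to rewrite a task so it becomes easier or harder."""
    if direction not in ("easier", "harder"):
        raise ValueError(f"unknown direction: {direction!r}")
    prompt = (
        f"Environment:\n{env.spec}\n\nTask:\n{task}\n\n"
        f"Rewrite this task so that it is {direction}."
    )
    return llm(prompt)


def bi_evol(llm: LLM, env: Environment, seed_task: str,
            steps_each_way: int = 2) -> List[str]:
    """Stage 2: evolve a seed task in both directions (assumed chaining)."""
    easier, harder = [], []
    current = seed_task
    for _ in range(steps_each_way):
        current = evolve_task(llm, env, current, "easier")
        easier.append(current)
    current = seed_task
    for _ in range(steps_each_way):
        current = evolve_task(llm, env, current, "harder")
        harder.append(current)
    # Order the result from the easiest variant to the hardest one.
    return easier[::-1] + [seed_task] + harder


def build_dataset(llm: LLM, corpus: List[str],
                  steps_each_way: int = 2) -> List[Environment]:
    """Run both stages for every segment in the inspiration corpus."""
    environments = []
    for text in corpus:
        env = generate_environment(llm, text)
        seed = llm(
            "Environment:\n" + env.spec
            + "\n\nWrite one planning task for this environment."
        )
        env.tasks = bi_evol(llm, env, seed, steps_each_way)
        environments.append(env)
    return environments
```

To try it, pass any prompt-to-text function as `llm` along with a list of text segments. Each returned `Environment` then holds `2 * steps_each_way + 1` tasks, ordered from easiest to hardest with the seed task in the middle.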