AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

2024-06-06

Abstract: Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community. Large language models (LLMs) are considered a promising foundation to build such agents due to their generalized capabilities. Current approaches either have LLM-based agents imitate expert-provided trajectories step-by-step, requiring human supervision, which is hard to scale and limits environmental exploration; or they let agents explore and learn in isolated environments, resulting in specialist agents with limited generalization. In this paper, we take the first step towards building generally-capable LLM-based agents with self-evolution ability. We identify a trinity of ingredients: 1) diverse environments for agent exploration and learning, 2) a trajectory set to equip agents with basic capabilities and prior knowledge, and 3) an effective and scalable evolution method. We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration. AgentGym also includes a database with expanded instructions, a benchmark suite, and high-quality trajectories across environments. Next, we propose a novel method, AgentEvol, to investigate the potential of agent self-evolution beyond previously seen data across tasks and environments. Experimental results show that the evolved agents can achieve results comparable to SOTA models. We release the AgentGym suite, including the platform, dataset, benchmark, checkpoints, and algorithm implementations. The AgentGym suite is available at https://github.com/WooooDyy/AgentGym.

Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

This paper addresses the problem of building generalist agents that can handle diverse tasks and evolve themselves across different environments. Current approaches fall into two camps. In the first, LLM-based agents imitate expert-provided trajectories step by step, which requires human supervision, is hard to scale, and limits environmental exploration. In the second, agents explore and learn in isolated environments, which produces specialist agents with limited generalization. The paper proposes AgentGym, a framework that offers a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration. It also proposes AgentEvol, a method for investigating how far agents can self-evolve beyond previously seen data across tasks and environments. Experimental results show that the evolved agents achieve results comparable to state-of-the-art models. The contributions include the AgentGym platform, a database with expanded instructions, a benchmark suite, high-quality trajectories across environments, and the AgentEvol algorithm. The authors present this work as a first step towards generally capable LLM-based agents with self-evolution ability.
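To make "uni-format, concurrent exploration" concrete, here is a minimal Python sketch. It is not the AgentGym API: the `TextEnv` interface, the `CountdownEnv` toy environment, and the `rollout` and `explore` helpers are all hypothetical names introduced only for illustration. The idea shown is simply that if every environment exposes the same text-in, text-out interface, one policy can collect trajectories from several environments at once.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Protocol, Tuple


class TextEnv(Protocol):
    """Hypothetical uniform interface (an assumption, not AgentGym's API)."""

    def reset(self) -> str: ...

    def step(self, action: str) -> Tuple[str, float, bool]: ...


@dataclass
class CountdownEnv:
    """Toy environment for illustration: send "dec" until the counter reaches zero."""

    start: int
    value: int = 0

    def reset(self) -> str:
        self.value = self.start
        return f"counter={self.value}"

    def step(self, action: str) -> Tuple[str, float, bool]:
        if action == "dec":
            self.value -= 1
        done = self.value <= 0
        reward = 1.0 if done else 0.0  # reward only on task completion
        return f"counter={self.value}", reward, done


Policy = Callable[[str], str]  # maps an observation to an action
Trajectory = List[Tuple[str, str]]  # (observation, action) pairs


def rollout(env: TextEnv, policy: Policy, max_steps: int = 20) -> Tuple[Trajectory, float]:
    """Run one episode and return the trajectory with its total reward."""
    observation = env.reset()
    trajectory: Trajectory = []
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(observation)
        trajectory.append((observation, action))
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return trajectory, total_reward


def explore(envs: List[TextEnv], policy: Policy) -> List[Tuple[Trajectory, float]]:
    """Collect one rollout per environment concurrently, preserving input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda env: rollout(env, policy), envs))


def always_dec(observation: str) -> str:
    """Stand-in for an LLM policy: always issues the same action."""
    return "dec"


results = explore([CountdownEnv(2), CountdownEnv(3)], always_dec)
```

In a self-evolution setting, the collected `(trajectory, reward)` pairs would be the raw material for further training; how AgentEvol actually uses them is described in the paper, not in this sketch.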

EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms

TrainerAgent: Customizable and Efficient Model Training Through LLM-Powered Multi-Agent System

AgentBench: Evaluating LLMs as Agents

A Survey on Self-Evolution of Large Language Models

AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

The Rise and Potential of Large Language Model Based Agents: A Survey

Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

ProAgent: Building Proactive Cooperative Agents with Large Language Models

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models

Training Agents with Weakly Supervised Feedback from Large Language Models

EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents

Agents: An Open-source Framework for Autonomous Language Agents

xLAM: A Family of Large Action Models to Empower AI Agent Systems

MegaAgent: A Practical Framework for Autonomous Cooperation in Large-Scale LLM Agent Systems

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

A Survey on Game Playing Agents and Large Models: Methods, Applications, and Challenges