A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning

Xinzhe Li
2024-10-24
Abstract: Tool use, planning, and feedback learning are currently three prominent paradigms for developing Large Language Model (LLM)-based agents across various tasks. Although numerous frameworks have been devised for each paradigm, their intricate workflows and inconsistent taxonomy create challenges in understanding and reviewing the frameworks across different paradigms. This survey introduces a unified taxonomy to systematically review and discuss these frameworks. Specifically, 1) the taxonomy defines environments/tasks, common LLM-profiled roles or LMPRs (policy models, evaluators, and dynamic models), and universally applicable workflows found in prior work; 2) it enables a comparison of key perspectives on the implementations of LMPRs and workflow designs across different agent paradigms and frameworks; and 3) we identify three limitations in existing workflow designs and systematically discuss future work. Resources are publicly available in our GitHub repository: https://github.com/xinzhel/LLM-Agent-Survey
Artificial Intelligence, Computation and Language, Software Engineering
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper addresses the difficulty of understanding and comparing frameworks across tool use, planning, and feedback learning, the three prominent paradigms for large language model (LLM)-based agents. Specifically:

1. **Complex Workflows and Inconsistent Taxonomy**:
   - Numerous frameworks have been devised for each paradigm, but their intricate workflows and inconsistent taxonomy make the frameworks hard to understand and review across paradigms.
   - The paper proposes a unified taxonomy to systematically review and discuss these frameworks.
2. **Lack of a Unified Basis for Comparison**:
   - Existing reviews often focus on a specific paradigm or domain and lack a coherent, unified starting point for discussing all three paradigms.
   - For example, some studies focus on frameworks in the planning paradigm, while others concentrate on LLM agents in gaming environments. This fragmented perspective makes cross-paradigm comparison difficult.
   - The paper extracts common workflows and proposes a task-agnostic process to facilitate comparison across paradigms and frameworks.
3. **Limitations of Existing Workflow Designs**:
   - The paper identifies three limitations in existing workflow designs and systematically discusses future research directions.
   - The limitations concern a unified solution for foundational workflows and autonomous tool-use workflows, the lack of a general tool-use workflow design, and the problem of formalizing language tasks as agent tasks.

### Main Contributions

1. **Defining the Scope of Applicable Environments**:
   - The paper summarizes the environments in which these agents operate, dividing them into decision-making environments and natural language interaction environments.
2. **Summarizing Common LLM Roles and Workflows**:
   - The paper examines the workflows built on complex algorithms in the planning and feedback learning paradigms, then looks beyond them for common workflows applicable to all types of environments.
   - The paper also describes common workflows for tool use, based on three common LLM roles: policy model, evaluator, and dynamic model.
3. **Emphasizing Key Perspectives**:
   - The paper compares how the LLM roles are implemented in different workflow types, highlights the workflows used in different paradigms, and offers alternative viewpoints to prevent confusion.
   - The paper also critically discusses three limitations of current frameworks and systematically points out potential workflow design directions for future research.

Through these contributions, the paper provides a systematic framework and methodology for understanding and developing LLM agents.
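
To make the three LLM roles named above more concrete, here is a minimal Python sketch of how a policy model, a dynamic model, and an evaluator might be composed into a single decision step. This is an illustration under stated assumptions, not a workflow taken from the paper: the type aliases `Policy`, `Dynamics`, and `Evaluator`, the function `lookahead_step`, and the one-step lookahead itself are all hypothetical, and the stub functions stand in for what would be prompted LLM calls in a real agent.

```python
from typing import Callable, List

# Hypothetical signatures for the three LLM-profiled roles (LMPRs).
Policy = Callable[[str], List[str]]    # state -> candidate actions
Dynamics = Callable[[str, str], str]   # (state, action) -> predicted next state
Evaluator = Callable[[str], float]     # state -> scalar score


def lookahead_step(
    state: str, policy: Policy, dynamics: Dynamics, evaluator: Evaluator
) -> str:
    """Return the candidate action whose predicted next state scores highest."""
    candidates = policy(state)
    if not candidates:
        raise ValueError("policy proposed no actions")
    return max(candidates, key=lambda action: evaluator(dynamics(state, action)))


# Stubs standing in for LLM calls (assumption: real roles would prompt an LLM).
def stub_policy(state: str) -> List[str]:
    return ["search", "answer"]


def stub_dynamics(state: str, action: str) -> str:
    return f"{state} -> {action}"


def stub_evaluator(state: str) -> float:
    # Toy scoring rule: prefer states reached by searching first.
    return 1.0 if state.endswith("search") else 0.0


best = lookahead_step("question", stub_policy, stub_dynamics, stub_evaluator)
print(best)  # search
```

Keeping each role behind a plain callable means the same step function can be reused whether a role is backed by an LLM prompt, a rule, or a real environment; swapping `stub_dynamics` for an actual environment call would turn the predicted next state into an observed one.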