Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning

Junzhi Chen, Juhao Liang, Benyou Wang
2024-06-24
Abstract: The emergence of large language models (LLMs) has opened up unprecedented possibilities for automating complex tasks, often at a level comparable to human performance. Despite their capabilities, LLMs still encounter difficulties in completing tasks that require high levels of accuracy and complexity due to their inherent limitations in handling multifaceted problems single-handedly. This paper introduces "Smurfs", a cutting-edge multi-agent framework designed to revolutionize the application of LLMs. By seamlessly transforming a conventional LLM into a synergistic multi-agent ensemble, Smurfs can enhance the model's ability to solve complex tasks at no additional cost. This is achieved through innovative prompting strategies that allocate distinct roles within the model, thereby facilitating collaboration among specialized agents and forming an intelligent multi-agent system. Our empirical investigation on both the open-ended tasks of StableToolBench and the closed-ended task of HotpotQA showcases Smurfs' superior capability in intricate tool utilization scenarios. Notably, Smurfs outmatches all the baseline methods in both experiments, setting new state-of-the-art performance. Furthermore, through comprehensive ablation studies, we dissect the contribution of the core components of the multi-agent framework to its overall efficacy. This not only verifies the effectiveness of the framework, but also sets a route for future exploration of multi-agent LLM systems.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper aims to address the limitations faced by large language models (LLMs) when handling complex tasks that require high precision, adaptability, and comprehensive knowledge integration. Although LLMs have been able to automate many complex tasks comparable to human performance, they still struggle with handling multifaceted problems single-handedly. The paper proposes a multi-agent framework called "Smurfs," which enhances the model's ability to solve complex tasks by transforming traditional LLMs into a collaborative multi-agent ensemble.

### Specific Problems and Solutions

1. **Multi-Tool Planning Challenges**:
   - **Effective Solution Planning**: Existing methods like ReACT and DFSDT face challenges in effective solution planning when dealing with multi-tool planning.
   - **Adaptability to New Tools**: LLMs find it difficult to quickly adapt to new tools when solving problems using multiple tools.
2. **Limitations of Existing Methods**:
   - **ReACT**: Although it proposes a think-act-observe format, it still has limitations in multi-tool planning.
   - **DFSDT**: Despite performing well in multi-tool planning, it has issues such as unstable rollback mechanisms, context redundancy, and premature termination.
3. **Innovations of the Smurfs Framework**:
   - **Multi-Agent System (MAS)**: By dividing tasks and collaborating, each agent focuses on specific subtasks, reducing context redundancy and improving task execution accuracy and output quality.
   - **Improved Rollback Mechanism**: Introduces a rule-based rollback mechanism to ensure the correctness of depth-first search, enabling even less capable models to effectively use DFSDT for tool planning.
   - **Combination of Macro and Micro Planning**: Uses task decomposition for macro planning and DFSDT for solving each subtask, avoiding premature termination issues.

### Experimental Validation

1. **Open Task: StableToolBench**:
   - Evaluation metrics include pass rate and win rate.
   - Experimental results show that Smurfs achieved the best or near-best performance on multiple LLMs, particularly excelling on the untrained Mistral-7B.
2. **Closed Task: HotpotQA**:
   - Evaluation metric is the F1 score.
   - Smurfs, even without training, not only outperformed other untrained agents but also, in some cases, surpassed trained agents, demonstrating its strong generalization ability and efficiency.

### Contribution Summary

1. **Proposed a novel plug-and-play multi-agent system framework**. Experiments show that this method is not only effective but also more cost-efficient than existing tool planning methods.
2. **Revealed the effectiveness of the multi-agent system framework through ablation studies**, providing valuable insights for future research.

### Conclusion

The Smurfs framework significantly enhances the performance of LLMs in multi-tool planning tasks through the collaborative work of a multi-agent system, addressing the limitations of existing methods and laying the foundation for future research in multi-agent systems.
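
### Illustrative Sketch

The combination of macro planning (task decomposition) and micro planning (depth-first search with rule-based rollback) described above can be illustrated with a small toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`solve_subtask`, `plan_and_solve`), the three rollback rules, the toy tools, and the hand-written subtask list are all hypothetical. In Smurfs the decomposition, tool selection, and answer checking are performed by prompted LLM agents, which are replaced here by plain Python functions so the example runs on its own.

```python
from typing import Callable, Dict, FrozenSet, List, Optional

# Hypothetical interfaces: a tool sees the subtask and the steps taken so far
# and returns a result string, or None if the call fails.
Tool = Callable[[str, List[str]], Optional[str]]
Check = Callable[[str, List[str]], bool]


def solve_subtask(subtask: str, tools: Dict[str, Tool], is_solved: Check,
                  max_depth: int = 2) -> Optional[List[str]]:
    """Micro planning: depth-first search over tool calls.

    Rollback is decided by fixed rules (assumed for this sketch), not by a model:
      1. a failed tool call is skipped,
      2. a tool already used on the current path is not retried,
      3. a path that reaches max_depth unsolved is abandoned.
    """
    def dfs(trace: List[str], used: FrozenSet[str]) -> Optional[List[str]]:
        if is_solved(subtask, trace):
            return trace
        if len(trace) >= max_depth:
            return None  # rule 3: depth limit reached, roll back
        for name, tool in tools.items():
            if name in used:
                continue  # rule 2: do not repeat a tool on this path
            result = tool(subtask, trace)
            if result is None:
                continue  # rule 1: failed call, try the next sibling
            found = dfs(trace + [f"{name} -> {result}"], used | {name})
            if found is not None:
                return found
        return None  # every child failed: roll back to the parent

    return dfs([], frozenset())


def plan_and_solve(subtasks: List[str], tools: Dict[str, Tool],
                   is_solved: Check) -> Dict[str, Optional[List[str]]]:
    """Macro planning: solve each subtask of a given decomposition separately."""
    return {s: solve_subtask(s, tools, is_solved) for s in subtasks}


# --- Toy tools and checker (all hypothetical) ---
def broken_api(subtask: str, trace: List[str]) -> Optional[str]:
    return None  # always fails


def find_city(subtask: str, trace: List[str]) -> Optional[str]:
    return "Paris" if "capital" in subtask else None


def search_wiki(subtask: str, trace: List[str]) -> Optional[str]:
    return "France is a country"  # succeeds, but never helps


def get_weather(subtask: str, trace: List[str]) -> Optional[str]:
    return "sunny" if any("Paris" in step for step in trace) else None


def is_solved(subtask: str, trace: List[str]) -> bool:
    target = "sunny" if "weather" in subtask else "Paris"
    return any(target in step for step in trace)


TOOLS: Dict[str, Tool] = {
    "broken_api": broken_api,
    "find_city": find_city,
    "search_wiki": search_wiki,
    "get_weather": get_weather,
}

if __name__ == "__main__":
    plan = ["capital of France", "weather in the capital of France"]
    for subtask, trace in plan_and_solve(plan, TOOLS, is_solved).items():
        print(subtask, "=>", trace)
```

In the second subtask the search first tries `search_wiki` after `find_city`, hits the depth limit without an answer, rolls back, and then succeeds with `get_weather`. Because each subtask starts with an empty trace, the context passed to the tools stays short, which loosely mirrors the context-efficiency argument made for the framework.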