Abstract: Training AI models has always been challenging, especially when custom models are needed to provide personalized services. Algorithm engineers often face a lengthy, iterative process to develop models tailored to specific business requirements, and the process is even harder for non-experts. The pursuit of high-quality, efficient model development, together with the emergence of Large Language Model (LLM) agents, has become a key focus in the industry. Leveraging the powerful analytical, planning, and decision-making capabilities of LLMs, we propose TrainerAgent, a multi-agent framework comprising Task, Data, Model, and Server agents. These agents analyze user-defined tasks, input data, and requirements (e.g., accuracy, speed), optimize them comprehensively from both the data and model perspectives to obtain satisfactory models, and finally deploy these models as an online service. Experimental evaluations on classical discriminative and generative tasks in computer vision and natural language processing demonstrate that our system consistently produces models that meet the desired criteria. Furthermore, the system can critically identify and reject unattainable tasks, such as fantastical scenarios or unethical requests, ensuring robustness and safety. This research presents a significant advance in obtaining desired models with greater efficiency and quality than traditional model development, facilitated by the integration of LLM-powered analysis, decision-making, and execution capabilities, as well as by collaboration among the four agents. We anticipate that our work will contribute to the advancement of research on TrainerAgent in both the academic and industrial communities, potentially establishing it as a new paradigm for model development in the field of AI.
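The four-agent flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: every class, method, and field name below is hypothetical, the LLM-driven analysis is replaced by trivial placeholder rules, and the "models" are pre-scored candidates rather than trained networks. It only shows the order of hand-offs (Task, then Data, then Model, then Server) and the early rejection of requirements that cannot be met.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirements:
    min_accuracy: float    # user-required accuracy, as a fraction in (0, 1]
    max_latency_ms: float  # user-required inference latency budget


@dataclass
class Candidate:
    name: str
    accuracy: float
    latency_ms: float


class TaskAgent:
    """Hypothetical stand-in for LLM-based task analysis."""

    def is_feasible(self, req: Requirements) -> bool:
        # Placeholder rule: reject requirements that no model could satisfy.
        return 0.0 < req.min_accuracy <= 1.0 and req.max_latency_ms > 0


class DataAgent:
    """Hypothetical stand-in for data-side optimization."""

    def prepare(self, samples: list[str]) -> list[str]:
        # Order-preserving removal of duplicate samples.
        return list(dict.fromkeys(samples))


class ModelAgent:
    """Hypothetical stand-in for model-side optimization."""

    def select(self, candidates: list[Candidate],
               req: Requirements) -> Optional[Candidate]:
        # Keep candidates that meet both requirements, then take the most accurate.
        ok = [c for c in candidates
              if c.accuracy >= req.min_accuracy
              and c.latency_ms <= req.max_latency_ms]
        return max(ok, key=lambda c: c.accuracy) if ok else None


class ServerAgent:
    """Hypothetical stand-in for deployment; returns a made-up endpoint label."""

    def deploy(self, model: Candidate) -> str:
        return f"service://{model.name}"


def run_pipeline(samples: list[str], candidates: list[Candidate],
                 req: Requirements) -> Optional[str]:
    """Run Task -> Data -> Model -> Server; return None if the task is rejected."""
    if not TaskAgent().is_feasible(req):
        return None
    data = DataAgent().prepare(samples)
    if not data:
        return None
    model = ModelAgent().select(candidates, req)
    return ServerAgent().deploy(model) if model else None
```

In this sketch a rejected task and an unmet requirement both surface as `None`; a fuller system would presumably report the reason back to the user, as the abstract's description of rejecting unattainable tasks suggests.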

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

AgentTuning: Enabling Generalized Agent Abilities for LLMs

CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models

TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems

TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage

TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents

MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories

Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning

MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

MMedAgent: Learning to Use Medical Tools with Multi-modal Agent

Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

TrainerAgent: Customizable and Efficient Model Training Through LLM-Powered Multi-Agent System

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping