Abstract: Hyperparameter optimization is critical in modern machine learning, requiring expert knowledge, numerous trials, and high computational and human resources. Despite the advancements in Automated Machine Learning (AutoML), challenges in terms of trial efficiency, setup complexity, and interoperability still persist. To address these issues, we introduce a novel paradigm leveraging Large Language Models (LLMs) to automate hyperparameter optimization across diverse machine learning tasks, which is named AgentHPO (short for LLM Agent-based Hyperparameter Optimization). Specifically, AgentHPO processes the task information autonomously, conducts experiments with specific hyperparameters (HPs), and iteratively optimizes them based on historical trials. This human-like optimization process largely reduces the number of required trials, simplifies the setup process, and enhances interpretability and user trust, compared to traditional AutoML methods. Extensive empirical experiments conducted on 12 representative machine-learning tasks indicate that AgentHPO not only matches but also often surpasses the best human trials in terms of performance while simultaneously providing explainable results. Further analysis sheds light on the strategies employed by the LLM in optimizing these tasks, highlighting its effectiveness and adaptability in various scenarios.

What problem does this paper attempt to address?

The paper addresses the problem of Hyperparameter Optimization (HPO) in machine learning. Specifically, it targets improvements in three areas:

1. **Reducing the number of trials**: Traditional Automated Machine Learning (AutoML) methods, although effective, require a large number of trials to optimize hyperparameters, which consumes a significant amount of time and computational resources. The proposed method aims to reduce the number of required trials through autonomous agents.
2. **Simplifying the configuration process**: Existing AutoML methods, while widely applied across different domains and hardware, have a complex configuration process that is prone to misconfiguration, leading to inefficiency or poor performance. The new method simplifies this process through natural language input and helps define the optimal hyperparameter search space.
3. **Improving interpretability and trust**: Many existing AutoML methods lack transparency, making it difficult for users to understand how different hyperparameters affect the model and the logic behind specific configuration choices. This reduces user trust, especially in the absence of expert knowledge. The proposed method provides clear textual explanations to enhance user understanding and trust in hyperparameter selection.

To address these challenges, the authors propose a new framework called AgentHPO, which uses autonomous agents driven by Large Language Models (LLMs) for hyperparameter optimization. AgentHPO includes two specialized agents: Creator and Executor. The Creator agent receives detailed task-related information (such as dataset characteristics, model structure, and optimization objectives) and generates initial hyperparameter configurations. The Executor agent then trains the model with the hyperparameters provided by the Creator, records experimental data, and analyzes the results. The Creator agent iteratively optimizes the hyperparameters based on the training history provided by the Executor agent.

The key contributions of the paper include:

- Introducing, for the first time, LLM-based autonomous agents to the study of HPO problems.
- Proposing a general framework consisting of two specialized agents, Creator and Executor, which work collaboratively to efficiently tune machine learning models.
- Conducting extensive experimental validation on multiple representative machine learning tasks, demonstrating the practical feasibility and superiority of the method.

In summary, the paper addresses the challenges faced by existing AutoML methods by introducing an efficient, easy-to-configure, and highly interpretable hyperparameter optimization method based on LLM-driven autonomous agents.
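The Creator/Executor loop described above can be illustrated with a minimal Python sketch. This is not the AgentHPO implementation: the LLM-backed Creator is replaced by a hypothetical rule-based stub, the Executor "trains" on a toy objective that peaks at a learning rate of 0.1, and all names (`Trial`, `creator`, `executor`, `run_hpo`) are assumptions made for illustration. It only shows the control flow: propose hyperparameters with a textual rationale, run a trial, record it, and propose again from the history.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

HParams = Dict[str, float]


@dataclass
class Trial:
    hparams: HParams
    score: float      # higher is better
    rationale: str    # textual explanation attached to the proposal


def executor(hparams: HParams) -> float:
    """Stand-in for model training: a toy objective that peaks at lr = 0.1."""
    return -(hparams["lr"] - 0.1) ** 2


def creator(history: List[Trial]) -> Tuple[HParams, str]:
    """Hypothetical stub for the LLM-backed Creator.

    It reads the trial history and returns new hyperparameters plus a
    human-readable reason. A real agent would prompt an LLM here.
    """
    if not history:
        return {"lr": 1.0}, "No history yet; start from a default learning rate."
    best = max(history, key=lambda t: t.score)
    best_lr = best.hparams["lr"]
    tried = {t.hparams["lr"] for t in history}
    lower = best_lr / 2
    if lower not in tried:
        return {"lr": lower}, f"Best so far is lr={best_lr}; try halving it."
    # Halving was already tried and did not beat the best trial.
    # Note: this simple stub can repeat a proposal on longer runs.
    return {"lr": best_lr * 1.5}, f"Halving lr={best_lr} did not help; try a larger value."


def run_hpo(n_trials: int) -> List[Trial]:
    """Alternate between Creator proposals and Executor runs."""
    history: List[Trial] = []
    for _ in range(n_trials):
        hparams, rationale = creator(history)
        score = executor(hparams)
        history.append(Trial(hparams, score, rationale))
    return history


if __name__ == "__main__":
    for trial in run_hpo(6):
        print(trial.hparams, round(trial.score, 6), "|", trial.rationale)
```

Storing the rationale next to each trial mirrors the interpretability goal described above: every configuration in the history carries the reason it was proposed.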

Large Language Model Agent for Hyper-Parameter Optimization

Using Large Language Models for Hyperparameter Optimization

In-the-loop Hyper-Parameter Optimization for LLM-Based Automated Design of Heuristics

Sequential Large Language Model-Based Hyper-Parameter Optimization

Hyperparameter optimization: Classics, acceleration, online, multi-objective, and tools

Training Language Model Agents without Modifying Language Models

Automatic Hyper-Parameter Optimization Based on Mapping Discovery from Data to Hyper-Parameters

Optimization Hyper-parameter Laws for Large Language Models

Large Language Models to Enhance Bayesian Optimization

Solving General Natural-Language-Description Optimization Problems with Large Language Models

Offline Training of Language Model Agents with Functions as Learnable Weights

AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML

Autonomous Multi-Objective Optimization Using Large Language Model

An investigation on the use of Large Language Models for hyperparameter tuning in Evolutionary Algorithms

Meta Hyperparameter Optimization with Adversarial Proxy Subsets Sampling

Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System

AgentTuning: Enabling Generalized Agent Abilities for LLMs

LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination

LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch

Large Language Models for Human-Machine Collaborative Particle Accelerator Tuning through Natural Language

HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model