Large Language Model Agent for Hyper-Parameter Optimization

Siyi Liu, Chen Gao, Yong Li
2024-02-06
Abstract: Hyperparameter optimization is critical in modern machine learning, requiring expert knowledge, numerous trials, and high computational and human resources. Despite the advancements in Automated Machine Learning (AutoML), challenges in terms of trial efficiency, setup complexity, and interoperability still persist. To address these issues, we introduce a novel paradigm leveraging Large Language Models (LLMs) to automate hyperparameter optimization across diverse machine learning tasks, which is named AgentHPO (short for LLM Agent-based Hyperparameter Optimization). Specifically, AgentHPO processes the task information autonomously, conducts experiments with specific hyperparameters (HPs), and iteratively optimizes them based on historical trials. This human-like optimization process largely reduces the number of required trials, simplifies the setup process, and enhances interpretability and user trust, compared to traditional AutoML methods. Extensive empirical experiments conducted on 12 representative machine-learning tasks indicate that AgentHPO not only matches but also often surpasses the best human trials in terms of performance while simultaneously providing explainable results. Further analysis sheds light on the strategies employed by the LLM in optimizing these tasks, highlighting its effectiveness and adaptability in various scenarios.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses Hyperparameter Optimization (HPO) in machine learning. It targets three shortcomings of existing Automated Machine Learning (AutoML) methods:

1. **Reducing the number of trials**: Traditional AutoML methods are effective but need a large number of trials to optimize hyperparameters, which consumes significant time and computational resources. The proposed method uses autonomous agents to reduce the number of trials required.
2. **Simplifying the configuration process**: Existing AutoML methods are widely applied across different domains and hardware, but their configuration process is complex and prone to misconfiguration, which leads to inefficiency or poor performance. The new method simplifies this process by accepting natural language input and helping to define the hyperparameter search space.
3. **Improving interpretability and trust**: Many existing AutoML methods lack transparency, so users struggle to understand how different hyperparameters affect the model and why a specific configuration was chosen. This reduces user trust, especially when expert knowledge is absent. The proposed method provides clear textual explanations of its hyperparameter choices to improve user understanding and trust.

To address these challenges, the authors propose AgentHPO, a framework that uses autonomous agents driven by Large Language Models (LLMs) for hyperparameter optimization. AgentHPO consists of two specialized agents, Creator and Executor. The Creator receives detailed task information (such as dataset characteristics, model structure, and optimization objectives) and generates the initial hyperparameter configuration. The Executor trains the model with the hyperparameters supplied by the Creator, records the experimental data, and analyzes the results. The Creator then iteratively refines the hyperparameters based on the training history provided by the Executor.

The key contributions of the paper include:

- Introducing LLM-based autonomous agents to the study of HPO problems for the first time.
- Proposing a general framework of two specialized agents, Creator and Executor, that work collaboratively to tune machine learning models efficiently.
- Conducting extensive experimental validation on multiple representative machine learning tasks, demonstrating the practical feasibility and superiority of the method.

In summary, the paper addresses the challenges faced by existing AutoML methods by introducing an efficient, easy-to-configure, and highly interpretable hyperparameter optimization method based on LLM-driven autonomous agents.
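The Creator/Executor loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (`Creator`, `Executor`, `Trial`, `toy_train`, `optimize`) are hypothetical, the LLM call is replaced by a simple rule-based search over the learning rate so the example runs offline, and model training is replaced by a toy objective that peaks at `lr = 0.01`. What it does show is the control flow: the Creator proposes a configuration with a textual rationale, the Executor runs it and logs the result, and the Creator reads the history before proposing again.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

HParams = Dict[str, float]


@dataclass
class Trial:
    hparams: HParams
    score: float      # higher is better
    rationale: str    # Creator's textual explanation for this configuration


class Creator:
    """Hypothetical stand-in for the LLM-backed Creator agent.

    A real Creator would prompt an LLM with the task description and the
    trial history; a rule-based search over the learning rate takes its
    place here (an assumption made so the sketch runs offline).
    """

    def __init__(self, task_description: str, initial: HParams) -> None:
        self.task_description = task_description
        self.initial = initial
        self.log_step = -0.5  # step in log10(lr) units (assumed heuristic)

    def propose(self, history: List[Trial]) -> Tuple[HParams, str]:
        if not history:
            return dict(self.initial), "Initial configuration from the task description."
        best = max(history, key=lambda t: t.score)
        if history[-1] is not best:
            # The last move hurt: reverse direction and take a smaller step.
            self.log_step = -self.log_step / 2
        lr = best.hparams["lr"] * 10 ** self.log_step
        direction = "lower" if self.log_step < 0 else "higher"
        why = f"Best score so far at lr={best.hparams['lr']:.4g}; trying a {direction} lr."
        return {**best.hparams, "lr": lr}, why


class Executor:
    """Runs one training job per proposal and records it in the history."""

    def __init__(self, train_fn: Callable[[HParams], float]) -> None:
        self.train_fn = train_fn
        self.history: List[Trial] = []

    def run(self, hparams: HParams, rationale: str) -> Trial:
        trial = Trial(hparams, self.train_fn(hparams), rationale)
        self.history.append(trial)
        return trial


def toy_train(hparams: HParams) -> float:
    """Toy objective peaking at lr = 0.01, standing in for real training."""
    return -(math.log10(hparams["lr"]) + 2) ** 2


def optimize(creator: Creator, executor: Executor, n_trials: int) -> Trial:
    """Alternate Creator proposals and Executor runs; return the best trial."""
    for _ in range(n_trials):
        hparams, why = creator.propose(executor.history)
        executor.run(hparams, why)
    return max(executor.history, key=lambda t: t.score)


creator = Creator("toy task: maximize validation score", {"lr": 0.1})
executor = Executor(toy_train)
best = optimize(creator, executor, n_trials=6)
for t in executor.history:
    print(f"lr={t.hparams['lr']:.4g}  score={t.score:.4f}  | {t.rationale}")
```

Keeping the rationale on each `Trial` mirrors the interpretability goal in the summary: every configuration in the history carries the reason it was tried. Swapping the rule-based `propose` for an actual LLM call would leave the `Executor` and the loop unchanged.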