Sequential Large Language Model-Based Hyper-Parameter Optimization

Kanan Mahammadli

2024-10-27

Abstract: This study introduces SLLMBO, an innovative framework that leverages Large Language Models (LLMs) for hyperparameter optimization (HPO), incorporating dynamic search space adaptability, enhanced parameter landscape exploitation, and a novel hybrid LLM-Tree-structured Parzen Estimator (LLM-TPE) sampler. By addressing limitations in recent fully LLM-based methods and traditional Bayesian Optimization (BO), SLLMBO achieves more robust optimization. A comprehensive benchmark evaluates multiple LLMs, including GPT-3.5-turbo, GPT-4o, Claude-Sonnet-3.5, and Gemini-1.5-flash, extending prior work beyond GPT-3.5 and GPT-4 and establishing SLLMBO as the first framework to benchmark a diverse set of LLMs for HPO. By integrating LLMs' established strengths in parameter initialization with the exploitation abilities demonstrated in this study, alongside TPE's exploration capabilities, the LLM-TPE sampler achieves a balanced exploration-exploitation trade-off, reduces API costs, and mitigates premature early stopping for more effective parameter searches. Across 14 tabular tasks in classification and regression, the LLM-TPE sampler outperformed fully LLM-based methods and achieved superior results over BO methods in 9 tasks. Testing early stopping in budget-constrained scenarios further demonstrated competitive performance, indicating that LLM-based methods generally benefit from extended iterations for optimal results. This work lays the foundation for future research exploring open-source LLMs, reproducibility of LLM results in HPO, and benchmarking SLLMBO on complex datasets, such as image classification, segmentation, and machine translation.

Machine Learning, Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

The problem this paper attempts to address is the shortcomings of existing hyperparameter optimization (HPO) methods in automation, search space adaptability, and exploration-exploitation balance. Specifically:

1. **Limitations of Traditional Bayesian Optimization (BO)**:
   - Requires human experts to define the parameters to be optimized and their possible value ranges.
   - The search space remains fixed throughout the optimization process and cannot be dynamically adjusted.
   - The optimization process usually starts with random initial parameters, which is inefficient and computationally expensive.
   - For each new task, BO needs to optimize from scratch.
2. **Limitations of Current Large Language Model (LLM)-Based Methods**:
   - They mainly use OpenAI's models, without extensive evaluation of other LLMs' performance.
   - Input token limits restrict them to a small number of iterations.
   - They lack a dynamic search space adaptation mechanism, which may lead to premature convergence or to missing the optimal solution.
   - They do not manage the exploration-exploitation balance autonomously.

To address these issues, the paper proposes a framework, **SLLMBO** (Sequential Large Language Model-Based Optimization), which combines the advantages of LLMs and the Tree-structured Parzen Estimator (TPE) sampling method to achieve dynamic search space adaptation, enhanced parameter landscape utilization, and a balanced exploration-exploitation strategy. With these improvements, SLLMBO aims to deliver more robust hyperparameter optimization across multiple benchmark tasks.
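The idea of mixing an exploiting proposer with an exploring one can be illustrated with a minimal sketch. This is not the paper's implementation: `mock_llm_propose` is a hypothetical stand-in for an LLM call (it simply perturbs the best point seen so far), `random_propose` is a uniform sampler swapped in for TPE, and the 50/50 mixing probability `llm_prob`, the `patience` early-stopping rule, and all function names are assumptions made for illustration. Dynamic search space adaptation is left out.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Params = Dict[str, float]
History = List[Tuple[Params, float]]
Space = Dict[str, Tuple[float, float]]


def random_propose(space: Space, history: History, rng: random.Random) -> Params:
    """Exploration stand-in for TPE (assumption): uniform sampling in bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}


def mock_llm_propose(space: Space, history: History, rng: random.Random) -> Params:
    """Exploitation stand-in for an LLM call (assumption): perturb the best point."""
    if not history:
        # No feedback yet: start from the midpoint of every range.
        return {name: (lo + hi) / 2 for name, (lo, hi) in space.items()}
    best_params, _ = min(history, key=lambda item: item[1])
    proposal = {}
    for name, (lo, hi) in space.items():
        step = 0.1 * (hi - lo)
        value = best_params[name] + rng.uniform(-step, step)
        proposal[name] = min(hi, max(lo, value))  # clamp to the search space
    return proposal


def hybrid_optimize(
    objective: Callable[[Params], float],
    space: Space,
    n_trials: int = 30,
    llm_prob: float = 0.5,
    patience: Optional[int] = None,
    seed: int = 0,
) -> Tuple[Params, float, History]:
    """Minimize `objective`, choosing a proposer at random on each trial."""
    rng = random.Random(seed)
    history: History = []
    best_score = float("inf")
    stale = 0  # trials since the last improvement
    for _ in range(n_trials):
        proposer = mock_llm_propose if rng.random() < llm_prob else random_propose
        params = proposer(space, history, rng)
        score = objective(params)
        history.append((params, score))
        if score < best_score:
            best_score, stale = score, 0
        else:
            stale += 1
        if patience is not None and stale >= patience:
            break  # budget-constrained early stopping
    best_params, best_score = min(history, key=lambda item: item[1])
    return best_params, best_score, history
```

A call such as `hybrid_optimize(objective, {"learning_rate": (0.0, 1.0)}, n_trials=30)` returns the best parameters, their score, and the full trial history. In a real setup the two stand-ins would be replaced by an actual LLM client and a TPE sampler from an HPO library, and only the LLM branch would incur API cost.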

Using Large Language Models for Hyperparameter Optimization

Large Language Models to Enhance Bayesian Optimization

Large Language Model Agent for Hyper-Parameter Optimization

In-the-loop Hyper-Parameter Optimization for LLM-Based Automated Design of Heuristics

LM4OPT: Unveiling the potential of Large Language Models in formulating mathematical optimization problems

LLMs are Highly-Constrained Biophysical Sequence Optimizers

Exploring the True Potential: Evaluating the Black-box Optimization Capability of Large Language Models

OptLLM: Optimal Assignment of Queries to Large Language Models

Hyperparameter Optimization for Large Language Model Instruction-Tuning

LLM+P: Empowering Large Language Models with Optimal Planning Proficiency

Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

Solving General Natural-Language-Description Optimization Problems with Large Language Models

Full Parameter Fine-tuning for Large Language Models with Limited Resources

LLM Interactive Optimization of Open Source Python Libraries -- Case Studies and Generalization

LLM as a Complementary Optimizer to Gradient Descent: A Case Study in Prompt Tuning

Using Large Language Models for Parametric Shape Optimization

L3Ms -- Lagrange Large Language Models

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs