Abstract: This paper studies the use of foundational large language models (LLMs) to make decisions during hyperparameter optimization (HPO). Empirical evaluations demonstrate that, in settings with constrained search budgets, LLMs can perform comparably to or better than traditional HPO methods such as random search and Bayesian optimization on standard benchmarks. Furthermore, we propose treating the code that specifies our model as a hyperparameter, which the LLM outputs, going beyond the capabilities of existing HPO approaches. Our findings suggest that LLMs are a promising tool for improving efficiency in the traditional decision-making problem of hyperparameter optimization.

What problem does this paper attempt to address?

The paper explores how large language models (LLMs) can be used for hyperparameter optimization (HPO). Specifically, it addresses the following key issues:

1. **Using LLMs for Hyperparameter Optimization**: The study investigates how to prompt LLMs to recommend hyperparameter settings and evaluates how well those settings perform under a given search budget.
2. **Improving the Efficiency of Traditional HPO Methods**: The paper points out that traditional HPO methods (such as random search and Bayesian optimization) have limitations, including reliance on manually designed search spaces and poor performance in the initial search phase. The researchers therefore explore whether LLMs can serve as a more efficient tool that addresses these limitations.
3. **Extending the Capabilities of HPO**: Beyond traditional hyperparameter configurations, the study proposes a more flexible approach: letting the LLM generate the training code (e.g., code written in PyTorch), so that the model structure and other related parameters are adjusted automatically.
4. **Evaluating Performance in Different Scenarios**: The paper evaluates LLMs on standard benchmark datasets and also tests them on low-dimensional optimization problems and code generation tasks to assess their generality and flexibility.

In summary, the core objective of this paper is to explore the potential of LLMs as a tool for hyperparameter optimization and to evaluate their effectiveness in different application scenarios, in particular whether they can match or surpass traditional methods under limited search budgets. The study also discusses how LLMs can extend to more flexible forms of hyperparameter configuration, such as automatically generating training code.
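The first issue above, prompting an LLM for hyperparameter settings under a fixed search budget, can be illustrated with a small loop. This is a minimal sketch under stated assumptions, not the paper's implementation: the prompt wording, the JSON reply format, the two hyperparameter names (`lr`, `weight_decay`), the toy objective, and every function name here are hypothetical. To keep the example runnable without any API access, the LLM is replaced by `make_fake_llm`, a stand-in that returns random configurations; a real setup would pass a function that sends the prompt to an actual model.

```python
import json
import math
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
History = List[Tuple[Config, float]]  # (configuration, validation loss)


def build_prompt(history: History, budget_left: int) -> str:
    """Format past trials as text that could be sent to an LLM (hypothetical wording)."""
    lines = [
        "You are tuning hyperparameters to minimize validation loss.",
        f"Remaining trials: {budget_left}.",
        "Past trials:",
    ]
    for config, loss in history:
        lines.append(f"{json.dumps(config)} -> loss={loss:.4f}")
    lines.append('Reply with JSON only, e.g. {"lr": 0.01, "weight_decay": 0.0001}.')
    return "\n".join(lines)


def parse_reply(reply: str) -> Config:
    """Parse the JSON reply into a configuration (assumed reply format)."""
    return {name: float(value) for name, value in json.loads(reply).items()}


def tune(
    objective: Callable[[Config], float],
    ask_llm: Callable[[str], str],
    budget: int,
) -> Tuple[Tuple[Config, float], History]:
    """Run `budget` trials; each proposal is conditioned on all earlier results."""
    history: History = []
    for step in range(budget):
        prompt = build_prompt(history, budget - step)
        config = parse_reply(ask_llm(prompt))
        history.append((config, objective(config)))
    best = min(history, key=lambda trial: trial[1])
    return best, history


def make_fake_llm(seed: int = 0) -> Callable[[str], str]:
    """Stand-in for a real LLM call: ignores the prompt and samples log-uniformly."""
    rng = random.Random(seed)

    def fake_llm(prompt: str) -> str:
        return json.dumps(
            {"lr": 10 ** rng.uniform(-4, -1), "weight_decay": 10 ** rng.uniform(-6, -2)}
        )

    return fake_llm


def toy_objective(config: Config) -> float:
    """Toy loss with its minimum at lr=1e-2, weight_decay=1e-4 (illustrative only)."""
    return (math.log10(config["lr"]) + 2) ** 2 + (math.log10(config["weight_decay"]) + 4) ** 2


if __name__ == "__main__":
    best, history = tune(toy_objective, make_fake_llm(seed=0), budget=10)
    print("best config:", best[0], "loss:", round(best[1], 4))
```

Because the stand-in ignores the prompt, this sketch behaves like random search; the point is only the structure of the loop, in which the full trial history and the remaining budget are placed in the prompt before each new proposal.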

Using Large Language Models for Hyperparameter Optimization
