Abstract: First-Order Bayesian Optimization (FOBO) is a sample-efficient sequential approach for finding the global maximum of an expensive-to-evaluate black-box objective function by suitably querying the function and its gradient. Such methods assume Gaussian process (GP) models for both the function and its gradient, and use them to construct an acquisition function that identifies the next query point. In this paper, we propose a class of practical FOBO algorithms that efficiently utilize the information from the gradient GP to identify potential query points with zero gradients. We construct a multi-level acquisition function: in the first step, we optimize a lower-level acquisition function with multiple restarts to identify potential query points with zero gradient value. We then use the upper-level acquisition function to rank these query points based on their function values to potentially identify the global maximum. As a final step, the potential maximizer is chosen as the actual query point. We validate the performance of our proposed algorithms on several test functions and show that they outperform state-of-the-art FOBO algorithms. We also illustrate the application of our algorithms to finding the optimal set of hyper-parameters in machine learning and to learning the optimal policy in reinforcement learning tasks.
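The two-level procedure described in the abstract can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the tiny `GP1D` class with a fixed RBF kernel, the squared posterior-mean gradient used as the lower-level objective (the paper's lower-level acquisition may differ, for example by including an uncertainty term), Expected Improvement as the upper-level score, and the helper names `lower_level`, `expected_improvement`, and `next_query`.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def rbf(a, b, ls=0.5):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)


class GP1D:
    """Tiny zero-mean GP regressor with fixed hyperparameters (illustrative)."""

    def __init__(self, X, y, noise=1e-6):
        self.X = np.asarray(X, float)
        y = np.asarray(y, float)
        K = rbf(self.X, self.X) + noise * np.eye(len(self.X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, x):
        """Return posterior mean and standard deviation at the points x."""
        x = np.atleast_1d(np.asarray(x, float))
        Ks = rbf(x, self.X)
        mu = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
        return mu, np.sqrt(var)


def lower_level(gp_grad, bounds, n_restarts=10, seed=0):
    """Multi-restart search for points where the gradient GP's mean is near zero."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    cands = []
    for x0 in rng.uniform(lo, hi, n_restarts):
        res = minimize(lambda x: gp_grad.predict(x)[0][0] ** 2,
                       x0=[x0], bounds=[(lo, hi)])
        cands.append(res.x[0])
    return np.array(cands)


def expected_improvement(gp_f, x, best):
    """Standard EI for maximization under the function GP."""
    mu, sd = gp_f.predict(x)
    z = (mu - best) / sd
    return (mu - best) * norm.cdf(z) + sd * norm.pdf(z)


def next_query(X, y, dy, bounds):
    """Lower level proposes stationary candidates; upper level ranks them."""
    gp_f, gp_g = GP1D(X, y), GP1D(X, dy)  # independent GPs for f and f'
    cands = lower_level(gp_g, bounds)
    scores = expected_improvement(gp_f, cands, np.max(y))
    return cands[np.argmax(scores)]


# Toy usage: f(x) = sin(3x) observed with its derivative at three points.
X = np.array([0.1, 0.8, 1.5])
x_next = next_query(X, np.sin(3 * X), 3 * np.cos(3 * X), (0.0, 2.0))
print("next query point:", x_next)
```

Fitting a separate GP to the gradient observations keeps the sketch short; in more than one dimension the same pattern would use one GP per partial derivative and minimize the squared norm of their posterior means.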

What problem does this paper attempt to address?

The paper aims to improve the efficiency and effectiveness of Bayesian Optimization (BO) in finding the global maximum of expensive black-box objective functions, specifically by using first-order information (function values together with their gradients) to guide the search. It proposes a class of practical First-Order Bayesian Optimization (FOBO) algorithms that use the information in the gradient Gaussian process (GP) more efficiently to identify potential query points and, ultimately, the global maximum.

### Problem Background

Bayesian Optimization is a black-box optimization method for finding the global maximum of an unknown function, and it is especially suitable when each evaluation is expensive. Traditional Zeroth-Order Bayesian Optimization (ZOBO) uses only function values, while First-Order Bayesian Optimization (FOBO) uses both function values and gradient information. Although FOBO is promising, existing methods suffer from high computational complexity and do not fully exploit the gradient information.

### Main Contributions of the Paper

1. **Multi-level Acquisition Function**: The paper proposes a multi-level acquisition function. It first identifies potential query points with zero gradient by optimizing the lower-level acquisition function, and then ranks these points with the upper-level acquisition function to determine which is most likely to be the global maximum.
2. **Efficient Gradient Utilization**: By assuming that each partial derivative is an independent Gaussian process, the paper reduces the computational complexity and uses the gradient information more effectively.
3. **Balance between Exploration and Exploitation**: Uncertainty estimates are introduced to encourage exploration, so that the search covers the domain more comprehensively.
4. **Two Specific FOBO Algorithms**: Two specific FOBO algorithms are proposed, one based on Expected Improvement (EI) and one on Probability of Improvement (PI), and their advantage is verified experimentally.

### Application Scenarios

The paper evaluates the proposed FOBO algorithms on multiple test functions and applies them to hyper-parameter optimization in machine learning and to optimal policy learning in reinforcement learning, demonstrating their effectiveness in practical applications.

### Summary

By proposing a new multi-level acquisition function framework, the paper addresses the high computational complexity and insufficient use of gradient information in existing FOBO algorithms, thereby improving the efficiency and effectiveness of Bayesian Optimization in finding the global maximum.
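To make the complexity claim under "Efficient Gradient Utilization" concrete, the comparison below uses the standard cubic cost of exact GP inference in the size of the kernel matrix. The symbols $n$ (number of observed points) and $d$ (input dimension) are introduced here for illustration, and the figures are textbook costs under the assumption of dense exact inference, not numbers quoted from the paper.

```latex
% Joint GP over f and its d partial derivatives: one kernel matrix of size n(d+1) x n(d+1).
% Independent GPs: d+1 kernel matrices, each of size n x n.
\[
  \underbrace{\mathcal{O}\!\left(n^{3}(d+1)^{3}\right)}_{\text{joint GP over } (f,\ \nabla f)}
  \quad \text{versus} \quad
  \underbrace{\mathcal{O}\!\left((d+1)\,n^{3}\right)}_{d+1 \text{ independent GPs}}
\]
```

For example, with $n = 100$ and $d = 9$ the joint model factorizes a single $1000 \times 1000$ matrix, whereas the independent model factorizes ten $100 \times 100$ matrices, a saving of roughly a factor of $(d+1)^{2} = 100$ in the cubic term.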

Practical First-Order Bayesian Optimization Algorithms
