Cost-aware Bayesian Optimization via the Pandora's Box Gittins Index

Qian Xie, Raul Astudillo, Peter I. Frazier, Ziv Scully, Alexander Terenin
2024-10-31
Abstract: Bayesian optimization is a technique for efficiently optimizing unknown functions in a black-box manner. To handle practical settings where gathering data requires use of finite resources, it is desirable to explicitly incorporate function evaluation costs into Bayesian optimization policies. To understand how to do so, we develop a previously-unexplored connection between cost-aware Bayesian optimization and the Pandora's Box problem, a decision problem from economics. The Pandora's Box problem admits a Bayesian-optimal solution based on an expression called the Gittins index, which can be reinterpreted as an acquisition function. We study the use of this acquisition function for cost-aware Bayesian optimization, and demonstrate empirically that it performs well, particularly in medium-high dimensions. We further show that this performance carries over to classical Bayesian optimization without explicit evaluation costs. Our work constitutes a first step towards integrating techniques from Gittins index theory into Bayesian optimization.
Machine Learning
What problem does this paper attempt to address?
This paper addresses a key problem in **cost-aware Bayesian optimization**: how to incorporate the cost of function evaluations into the optimization process so that it reflects the resource limits of practical applications.

#### Background and Motivation

1. **Limitations of Bayesian Optimization**:
   - Standard Bayesian optimization methods mainly focus on minimizing simple regret, that is, on finding the global optimum.
   - In many practical applications, however, the cost of obtaining each sample cannot be ignored. For example, when tuning hyperparameters on cloud services, the time and computing resources needed to train neural networks incur direct monetary costs.
2. **Deficiencies of Existing Methods**:
   - Many existing cost-aware methods rely on multi-step lookahead computations, which are computationally expensive and numerically unstable.
   - Some popular cost-aware acquisition functions, such as expected improvement per unit cost (EIPC), have been theoretically shown to perform poorly on certain problems.

#### Research Objectives

- **Introduce a New Theoretical Framework**: Relate cost-aware Bayesian optimization to the Pandora's Box problem from economics, and propose a new acquisition function based on the Gittins index.
- **Develop an Efficient and Robust Acquisition Function**: Design an acquisition function that is theoretically sound, simple to compute, effective on medium- to high-dimensional problems, and suitable for problems with different cost structures.
- **Extend to Classical Bayesian Optimization**: Verify the effectiveness of the new method on classical Bayesian optimization problems, which have no explicit evaluation costs.

#### Main Contributions

1. **Establish the Connection between the Pandora's Box Problem and Cost-Aware Bayesian Optimization**:
   - Identify the mathematical connection between the two problems and use the Bayesian-optimal solution of the Pandora's Box problem to derive a new class of acquisition functions.
2. **Propose the Pandora's Box Gittins Index (PBGI) Acquisition Function**:
   - PBGI is a new acquisition function that accounts for evaluation costs and performs well on medium- to high-dimensional problems.
   - PBGI works in both the budget-constrained and the cost-per-sample settings.
3. **Empirical Analysis**:
   - Extensive experiments evaluate PBGI, which outperforms existing baseline methods particularly on medium-dimensional problems.
   - On classical Bayesian optimization problems, PBGI still performs well even without explicit costs.

#### Formula Summary

- **Definition of the Gittins Index**:
  \[ \alpha^\star(x) = g \quad\text{where}\quad g \text{ satisfies}\quad \mathbb{E}[I_f(x; g)] = c(x) \]
  where \(I_f(x; g) = \max(f(x) - g, 0)\) is the improvement of \(f(x)\) over the threshold \(g\), so that the left-hand side is the expected improvement with respect to \(g\), and \(c(x)\) is the cost of evaluating \(f\) at \(x\).
- **PBGI Acquisition Function**:
  \[ \alpha_{\text{PBGI}}^t(x) = g \quad\text{where}\quad g \text{ satisfies}\quad \mathbb{E}[I_{f \mid x_{1:t},\, y_{1:t}}(x; g)] = \lambda c(x) \]
  Here the expectation is taken under the posterior of \(f\) given the first \(t\) observations, and \(\lambda\) is a hyperparameter that scales the cost to account for the budget constraint.

Through these contributions, the paper offers a new perspective on cost-aware Bayesian optimization and demonstrates its potential in practical applications.
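The PBGI definition in the formula summary is an implicit equation in \(g\), so a small numerical sketch may help make it concrete. The code below assumes the posterior at a single point \(x\) is Gaussian with mean `mu` and standard deviation `sigma`, in which case the expected improvement has the standard closed form \(\sigma\,[\varphi(z) + z\,\Phi(z)]\) with \(z = (\mu - g)/\sigma\). Because expected improvement is strictly decreasing in \(g\), the index can be found by bisection. The function names `expected_improvement` and `pbgi_index`, the choice of bisection, and the tolerance are illustrative assumptions, not the authors' implementation.

```python
import math


def expected_improvement(mu: float, sigma: float, g: float) -> float:
    """E[max(f(x) - g, 0)] when f(x) ~ Normal(mu, sigma^2) (maximization)."""
    z = (mu - g) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z
    return sigma * (pdf + z * cdf)


def pbgi_index(mu: float, sigma: float, cost: float,
               lam: float = 1.0, tol: float = 1e-10) -> float:
    """Solve E[I(x; g)] = lam * cost for g by bisection (hypothetical helper)."""
    if sigma <= 0 or cost <= 0 or lam <= 0:
        raise ValueError("sigma, cost and lam must be positive")
    target = lam * cost

    # Lower bracket: EI(g) >= mu - g, so EI(mu - target) >= target.
    lo = mu - target
    # Upper bracket: EI decreases to 0 as g grows, so expand until EI <= target.
    hi = mu + sigma
    step = sigma
    while expected_improvement(mu, sigma, hi) > target:
        step *= 2.0
        hi += step

    # Bisection on the decreasing function g -> EI(g) - target.
    for _ in range(200):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if expected_improvement(mu, sigma, mid) > target:
            lo = mid  # EI still too large: the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Two candidate points with the same posterior but different costs:
    # the cheaper point receives the larger index.
    print(pbgi_index(mu=0.0, sigma=1.0, cost=0.1))
    print(pbgi_index(mu=0.0, sigma=1.0, cost=1.0))
```

In a full optimization loop one would evaluate this index at each candidate point using the posterior mean, posterior standard deviation, and cost at that point, and then query the point with the largest index. Setting `lam` higher penalizes costly evaluations more strongly.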