Batch Bayesian Optimization for Replicable Experimental Design

Zhongxiang Dai, Quoc Phong Nguyen, Sebastian Shenghong Tay, Daisuke Urano, Richalynn Leong, Bryan Kian Hsiang Low, Patrick Jaillet
2023-11-02
Abstract: Many real-world experimental design problems (a) evaluate multiple experimental conditions in parallel and (b) replicate each condition multiple times due to large and heteroscedastic observation noise. Given a fixed total budget, this naturally induces a trade-off between evaluating more unique conditions while replicating each of them fewer times vs. evaluating fewer unique conditions and replicating each more times. Moreover, in these problems, practitioners may be risk-averse and hence prefer an input with both good average performance and small variability. To tackle both challenges, we propose the Batch Thompson Sampling for Replicable Experimental Design (BTS-RED) framework, which encompasses three algorithms. Our BTS-RED-Known and BTS-RED-Unknown algorithms, for, respectively, known and unknown noise variance, choose the number of replications adaptively rather than deterministically such that an input with a larger noise variance is replicated more times. As a result, despite the noise heteroscedasticity, both algorithms enjoy a theoretical guarantee and are asymptotically no-regret. Our Mean-Var-BTS-RED algorithm aims at risk-averse optimization and is also asymptotically no-regret. We also show the effectiveness of our algorithms in two practical real-world applications: precision agriculture and AutoML.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses two main problems:

1. **Trade-off between the number of unique conditions and the number of replications**: Many practical experimental design problems have large, heteroscedastic observation noise, so multiple experimental conditions are evaluated in parallel and each condition is replicated several times to obtain a reliable estimate. Given a fixed total budget, this creates a trade-off in every iteration: evaluate more unique conditions with fewer replications each, or evaluate fewer unique conditions with more replications each. Because the noise level differs across inputs, the challenge is to choose both the conditions and their replication counts so that the limited budget is used as effectively as possible.
2. **Risk-averse optimization**: In some experimental design problems, practitioners want an input that has not only good average performance but also small variability. In addition to maximizing the mean of the objective function, the method must therefore also keep the noise variance small.

To address both problems, the authors propose the Batch Thompson Sampling for Replicable Experimental Design (BTS-RED) framework, which comprises three algorithms:

- **BTS-RED-Known** and **BTS-RED-Unknown**: Apply when the noise variance is known and unknown, respectively. Both algorithms choose the number of replications for each input adaptively, so that inputs with larger noise variance are replicated more times. As a result, both carry a theoretical guarantee and are asymptotically no-regret despite the heteroscedastic noise.
- **Mean-Var-BTS-RED**: Targets risk-averse optimization by maximizing a mean-variance objective, which favors inputs with both high average performance and small variability. This algorithm is also asymptotically no-regret.

The paper also demonstrates the effectiveness of these algorithms in two real-world applications: precision agriculture and Automated Machine Learning (AutoML).
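
The adaptive replication idea and the mean-variance objective described above can be illustrated with a minimal Python sketch. Several parts are assumptions made for illustration and are not stated in this summary: the rule `ceil(noise_var / target_var)` (replicate until the variance of the averaged observation falls below a chosen threshold), the greedy budget check, the weighted form `w * mean - (1 - w) * noise_var`, and every function and parameter name. The Thompson sampling step that proposes and orders the candidate inputs is omitted; the candidates are taken as already ranked.

```python
import math


def replications(noise_var: float, target_var: float) -> int:
    """Smallest n >= 1 such that noise_var / n <= target_var.

    Assumed rule: averaging n replications divides the noise variance by n,
    so noisier inputs need more replications to reach the same target.
    """
    return max(1, math.ceil(noise_var / target_var))


def fill_batch(candidate_noise_vars, budget: int, target_var: float):
    """Add candidates in the given order until the budget would be exceeded.

    Returns (batch, spent), where batch is a list of
    (candidate_index, n_replications) pairs.
    This greedy stopping rule is a simplification, not the paper's procedure.
    """
    batch = []
    spent = 0
    for index, noise_var in enumerate(candidate_noise_vars):
        n = replications(noise_var, target_var)
        if spent + n > budget:
            break  # the next candidate does not fit in the remaining budget
        batch.append((index, n))
        spent += n
    return batch, spent


def mean_var_score(mean: float, noise_var: float, w: float) -> float:
    """Hypothetical mean-variance score: reward a high mean, penalize variance.

    w in [0, 1] trades off average performance against variability.
    """
    return w * mean - (1.0 - w) * noise_var


if __name__ == "__main__":
    # Three ranked candidates with noise variances 4.0, 1.0 and 0.25.
    batch, spent = fill_batch([4.0, 1.0, 0.25], budget=10, target_var=0.5)
    print(batch, spent)
```

With these inputs, the noisiest candidate takes 8 of the 10 available evaluations and the second takes the remaining 2, so the third candidate no longer fits. This shows the trade-off in miniature: a lower `target_var` gives more reliable estimates per input but leaves room for fewer unique inputs in each batch.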