Jose Pablo Folch, James Odgers, Shiqiang Zhang, Robert M Lee, Behrang Shafei, David Walz, Calvin Tsay, Mark van der Wilk, Ruth Misener
Abstract: There has been a surge in interest in data-driven experimental design with applications to chemical engineering and drug manufacturing. Bayesian optimization (BO) has proven to be adaptable to such cases, since we can model the reactions of interest as expensive black-box functions. Sometimes, the cost of these black-box functions can be separated into two parts: (a) the cost of the experiment itself, and (b) the cost of changing the input parameters. In this short paper, we extend the SnAKe algorithm to deal with both types of costs simultaneously. We further propose extensions to the case of a maximum allowable input change, as well as to the multi-objective setting.
What problem does this paper attempt to address?
This paper attempts to solve the following three main problems:
1. **Self-stopping Experiments**:
- In practice, the number of experiments required may not be known in advance, or the experimenter may want to run as few experiments as possible to save materials. The original SnAKe algorithm requires the total number of experiments to be fixed beforehand, which is often not feasible. The paper therefore proposes a self-stopping SnAKe algorithm (ssSnAKe). It uses the Expected Improvement per unit cost (EIpu) and the predictive variance as termination conditions, so the algorithm stops automatically once a sufficiently good result has been reached.
2. **Maximum Allowable Input Changes**:
- In some applications, the change between consecutive inputs is restricted, for example by a limit on the rate of temperature change or by a maximum temperature. Although SnAKe minimizes the cumulative cost of input changes, the individual input change at each iteration is unbounded, which may lead to unsafe or infeasible steps. The paper therefore proposes the Truncated SnAKe algorithm (TrSnAKe), which bounds the size of each individual input change.
3. **Multi-objective Optimization**:
- Experimenters often face several objectives rather than a single one. SnAKe assumes that there is only one objective function, whereas multi-objective optimization aims to find a Pareto front. The paper proposes a multi-objective SnAKe algorithm (MO-SnAKe), which uses random scalarization to combine the objective functions into a single objective.
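The self-stopping idea in item 1 can be illustrated with a minimal sketch. It uses the standard closed-form Expected Improvement for a Gaussian posterior (maximization) and divides it by a cost. The function names, the thresholds `eipu_tol` and `var_tol`, and the rule of requiring both quantities to be small are assumptions made for illustration; the summary does not state how ssSnAKe combines the two conditions.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No uncertainty left: the improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def ei_per_unit_cost(mu, sigma, best, cost):
    """EIpu: expected improvement divided by the cost of the experiment."""
    return expected_improvement(mu, sigma, best) / cost


def should_stop(candidates, best, eipu_tol, var_tol):
    """Hypothetical stopping rule (an assumption, not the paper's exact test).

    candidates: list of (mu, sigma, cost) tuples for the remaining query points.
    Stops when neither the EIpu nor the predictive variance of any candidate
    exceeds its threshold.
    """
    max_eipu = max(ei_per_unit_cost(m, s, best, c) for m, s, c in candidates)
    max_var = max(s * s for _, s, _ in candidates)
    return max_eipu < eipu_tol and max_var < var_tol
```

With a rule of this shape, the loop ends once the model no longer expects any remaining candidate to repay its cost and is also confident about its predictions there.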
### Formula Summary
- **Experimental Cost Function**:
\[
C(x_t, x_{t + 1})=C^{(0)}(x_{t + 1})+C^{(\Delta)}(x_t, x_{t + 1})
\]
where \(C^{(0)}(x_{t + 1})\) is the cost of running the experiment at \(x_{t + 1}\) itself, and \(C^{(\Delta)}(x_t, x_{t + 1})\) is the cost of changing the input from \(x_t\) to \(x_{t + 1}\).
- **Maximum Allowable Input Change Constraint**:
\[
|x_{t + 1}-x_t|<\delta_{\text{MAX}}, \quad \forall t\in\{1, \ldots, T\}
\]
- **Scalarization Function in Multi-objective Optimization**:
- Linear Scalarization:
\[
S(\{f^{(k)}\}_{k = 1}^K, \lambda)(x)=\sum_{k = 1}^K\lambda_k f^{(k)}(x)
\]
- Chebyshev Scalarization:
\[
S(\{f^{(k)}\}_{k = 1}^K, \lambda)(x)=\min_{k = 1,\ldots,K}\lambda_k(f^{(k)}(x)-z^{(k)})
\]
where \(\lambda_k\) is the weight on the \(k\)-th objective and \(z^{(k)}\) is the \(k\)-th component of the reference point.
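The formulas in this summary can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the experiment cost \(C^{(0)}\) is taken to be a constant and the change cost \(C^{(\Delta)}\) proportional to the Euclidean distance moved; `truncate_step` is a hypothetical helper that places an over-long move on the boundary of the \(\delta_{\text{MAX}}\) ball (so it satisfies the constraint with \(\le\) rather than the strict inequality); and `sample_weights` draws \(\lambda\) uniformly from the simplex, which is one possible choice for random scalarization.

```python
import math
import random


def step_cost(x_prev, x_next, base_cost=1.0, move_rate=1.0):
    """C(x_t, x_{t+1}) = C0(x_{t+1}) + C_delta(x_t, x_{t+1}).

    Assumption: C0 is constant and C_delta is proportional to the
    Euclidean distance between consecutive inputs.
    """
    return base_cost + move_rate * math.dist(x_prev, x_next)


def total_cost(path, base_cost=1.0, move_rate=1.0):
    """Sum the step cost over consecutive pairs of inputs in a path."""
    return sum(step_cost(a, b, base_cost, move_rate)
               for a, b in zip(path, path[1:]))


def truncate_step(x_prev, x_next, delta_max):
    """Hypothetical helper: shorten the move so its length is at most delta_max."""
    dist = math.dist(x_prev, x_next)
    if dist <= delta_max:
        return tuple(x_next)
    scale = delta_max / dist
    # Move from x_prev towards x_next, but only delta_max far.
    return tuple(p + scale * (n - p) for p, n in zip(x_prev, x_next))


def linear_scalarization(values, weights):
    """Weighted sum of the objective values f^(k)(x)."""
    return sum(w * v for w, v in zip(weights, values))


def chebyshev_scalarization(values, weights, reference):
    """Minimum over k of lambda_k * (f^(k)(x) - z^(k))."""
    return min(w * (v - z) for w, v, z in zip(weights, values, reference))


def sample_weights(k, rng):
    """Assumed sampling scheme: normalized exponentials, uniform on the simplex."""
    raw = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(raw)
    return [r / total for r in raw]
```

For example, `total_cost([(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)])` charges one move of length 5 plus two unit experiment costs, and repeating the same input adds only the experiment cost. Each call to `sample_weights` yields a different single-objective problem, which is how random scalarization explores different parts of the Pareto front.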
Through these improvements, the paper aims to make the SnAKe algorithm more suitable for practical experimental design scenarios and improve its performance under different constraint conditions.