Bayesian Optimization for Non-Convex Two-Stage Stochastic Optimization Problems

Jack M. Buckingham,Ivo Couckuyt,Juergen Branke
2024-08-31
Abstract:Bayesian optimization is a sample-efficient method for solving expensive, black-box optimization problems. Stochastic programming concerns optimization under uncertainty where, typically, average performance is the quantity of interest. In the first stage of a two-stage problem, here-and-now decisions must be made in the face of this uncertainty, while in the second stage, wait-and-see decisions are made after the uncertainty has been resolved. Many methods in stochastic programming assume that the objective is cheap to evaluate and linear or convex. In this work, we apply Bayesian optimization to solve non-convex, two-stage stochastic programs which are expensive to evaluate. We formulate a knowledge-gradient-based acquisition function to jointly optimize the first- and second-stage variables, establish a guarantee of asymptotic consistency and provide a computationally efficient approximation. We demonstrate comparable empirical results to an alternative we formulate which alternates its focus between the two variable types, and superior empirical results over the standard, naive, two-step benchmark. We show that differences in the dimension and length scales between the variable types can lead to inefficiencies of the two-step algorithm, while the joint and alternating acquisition functions perform well in all problems tested. Experiments are conducted on both synthetic and real-world examples.
Machine Learning,Optimization and Control
What problem does this paper attempt to address?
The problem that this paper attempts to solve is the non - convex two - stage stochastic optimization problems. The characteristic of this type of problems is that there is uncertainty in the decision - making process, and optimal decisions need to be made in the context of uncertainty. Specifically, this problem can be divided into two stages: 1. **First stage**: Before the uncertainty is resolved, "here - and - now decisions" must be made. These decisions are usually referred to as fixed design variables \( \mathbf{x} \). 2. **Second stage**: After the uncertainty is observed, "wait - and - see decisions" are made. These decisions usually depend on the uncertainty variable \( \mathbf{u} \), denoted as \( \mathbf{y} = g(\mathbf{u}) \). The objective is to maximize the expected value \( E_{\mathbf{u}}[h(\mathbf{x}, \mathbf{y}, \mathbf{u})] \) of a certain function \( h(\mathbf{x}, \mathbf{y}, \mathbf{u}) \), where \( \mathbf{u} \) is a random variable representing the uncertainty in the system. ### Specific problem description - **Non - convexity**: Different from traditional linear or convex optimization problems, this paper focuses on non - convex problems, which makes the problems more complex and difficult to solve. - **Expensive evaluation**: The evaluation of the objective function \( h(\mathbf{x}, \mathbf{y}, \mathbf{u}) \) is very expensive. It may be necessary to run time - consuming fluid mechanics or finite element analysis simulations, or conduct physical experiments. - **Joint optimization**: Traditional two - stage optimization methods are usually carried out step by step, optimizing \( \mathbf{y} \) first and then \( \mathbf{x} \). However, this method is inefficient and may not converge to the global optimal solution. Therefore, this paper proposes a method based on Bayesian optimization to jointly optimize \( \mathbf{x} \) and \( \mathbf{y} \). ### Solutions In order to efficiently solve the above problems, the paper proposes the following methods: 1. **Knowledge gradient acquisition function**: The author constructs an acquisition function based on the Knowledge Gradient (KG) for jointly optimizing the variables in the first stage and the second stage. The knowledge gradient is an acquisition function that can handle partial information and observation noise, and is suitable for multi - task and multi - fidelity optimization problems. 2. **Asymptotic consistency**: The paper establishes the asymptotic consistency guarantee of the recommended solution, that is, as the number of samples increases, the value of the recommended solution will converge to the true maximum. 3. **Computationally efficient approximation method**: In order to calculate the knowledge gradient efficiently, the author proposes a method based on discretization and Quasi - Monte Carlo (qMC) approximation, avoiding the high computational cost of nested optimization. 4. **Alternating strategy**: In addition to the joint optimization strategy, the paper also proposes an alternating optimization strategy. This strategy alternately optimizes \( \mathbf{x} \) and \( \mathbf{y} \) in each iteration, and its performance is demonstrated in the experiment. ### Application scenarios Non - convex two - stage stochastic optimization problems have a wide range of applications in operations research and engineering fields, such as the design of wind farms (Chen et al., 2022), power plant management (Phan & Ghosh, 2014), and control co - design in optimal control, which involves simultaneously optimizing the fixed plant design and control signals in the feedback loop. Through these methods, the paper aims to provide an efficient and accurate solution to deal with complex optimization problems in practical applications.