Time-Series-Informed Closed-loop Learning for Sequential Decision Making and Control

Sebastian Hirt, Lukas Theiner, Rolf Findeisen
2024-12-03
Abstract: Closed-loop performance of sequential decision making algorithms, such as model predictive control, depends strongly on the parameters of cost functions, models, and constraints. Bayesian optimization is a common approach to learning these parameters based on closed-loop experiments. However, traditional Bayesian optimization approaches treat the learning problem as a black box, ignoring valuable information and knowledge about the structure of the underlying problem, resulting in slow convergence and high experimental resource use. We propose a time-series-informed optimization framework that incorporates intermediate performance evaluations from early iterations of each experimental episode into the learning procedure. Additionally, probabilistic early stopping criteria are proposed to terminate unpromising experiments, significantly reducing experimental time. Simulation results show that our approach achieves baseline performance with approximately half the resources. Moreover, with the same resource budget, our approach outperforms the baseline in terms of final closed-loop performance, highlighting its efficiency in sequential decision making scenarios.
Systems and Control, Machine Learning
What problem does this paper attempt to address?
This paper addresses the strong dependence of closed-loop performance in sequential decision making and control on the parameters of cost functions, models, and constraints. When these parameters are learned from closed-loop experiments, traditional Bayesian optimization treats the problem as a black box and ignores valuable information about its underlying structure, resulting in slow convergence and high experimental resource use. To this end, the authors propose a time-series-informed optimization framework. It incorporates intermediate performance evaluations from the early iterations of each experimental episode into the learning procedure and adds probabilistic early stopping criteria that terminate unpromising experiments, significantly reducing experimental time. Simulation results show that the method achieves baseline performance with approximately half the resources and, with the same resource budget, outperforms the baseline in final closed-loop performance, highlighting its efficiency in sequential decision making scenarios.

Specifically, the main contributions of the paper include:

1. **Time-Series-Informed Bayesian Optimization (TSI-BO)**: Aligns the fidelity dimension of the surrogate model with the time axis of the closed-loop experiment.
2. **Probabilistic Decision Criteria Based on Upper Confidence Bound (UCB) and Expected Improvement (EI)**: Used for early stopping of unpromising experiments.
3. **Convergence-Based Stopping Criterion**: Uses information from the closed-loop trajectory to decide whether to terminate the experiment.

Together, these methods improve the convergence speed, resource efficiency, and closed-loop performance of multi-fidelity Bayesian optimization for closed-loop performance optimization.
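
To make the stopping ideas above concrete, here is a minimal Python sketch of what UCB-based, EI-based, and convergence-based stopping rules could look like. It is an illustration under stated assumptions, not the authors' exact criteria: performance is assumed to be maximized, `mu` and `sigma` are assumed to be the surrogate's Gaussian prediction of the final (full-horizon) performance given the partial episode, and all function names, thresholds (`beta`, `min_ei`, `window`, `tol`), and the use of per-step stage costs are hypothetical choices made for this example.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` (maximization) for N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


def stop_by_ucb(mu, sigma, best, beta=2.0):
    """Stop if even the optimistic bound mu + beta * sigma is below the incumbent.

    Assumption: beta is a hand-picked confidence multiplier.
    """
    return mu + beta * sigma < best


def stop_by_ei(mu, sigma, best, min_ei=1e-3):
    """Stop if the expected improvement falls below a hypothetical threshold."""
    return expected_improvement(mu, sigma, best) < min_ei


def stop_by_convergence(stage_costs, window=5, tol=1e-4):
    """Stop once the last `window` stage costs are all below `tol` in magnitude.

    Assumption: small stage costs mean the closed-loop trajectory has settled,
    so the remaining steps would add little to the accumulated performance.
    """
    if len(stage_costs) < window:
        return False
    return all(abs(c) < tol for c in stage_costs[-window:])
```

In a closed-loop experiment, such checks would be evaluated at intermediate time steps of an episode. For example, `stop_by_ucb(mu=-10.0, sigma=1.0, best=-5.0)` signals termination, because the optimistic estimate of `-8.0` is still worse than the best performance observed so far.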