Online model adaptation in Monte Carlo tree search planning

Maddalena Zuccotto, Edoardo Fusa, Alberto Castellini, Alessandro Farinelli
DOI: https://doi.org/10.1007/s11081-024-09896-2
IF: 2.619
2024-06-19
Optimization and Engineering
Abstract: We propose a model-based reinforcement learning method that uses Monte Carlo Tree Search planning. The approach assumes an approximate black-box model of the environment, developed by an expert using any kind of modeling framework, and improves that model as new information from the environment is collected. This is crucial in real-world applications, since obtaining complete knowledge of complex environments is impractical. The expert's model is first translated into a neural network and then updated periodically using data, i.e., state-action-next-state triplets, collected from the real environment. We propose three different methods to integrate data acquired from the environment with the prior knowledge provided by the expert, and we evaluate our approach on a domain concerning air quality and thermal comfort control in smart buildings. We compare the three proposed versions with standard Monte Carlo Tree Search planning using the expert's model (without adaptation), Proximal Policy Optimization (a popular model-free deep reinforcement learning approach) and Stochastic Lower Bounds Optimization (a popular model-based deep reinforcement learning approach). Results show that our approach outperforms all analyzed competitors.
Engineering, Multidisciplinary; Operations Research & Management Science; Mathematics, Interdisciplinary Applications
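The adaptation loop described in the abstract (translate the expert's model into a neural network, then update it with state-action-next-state triplets from the real environment) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-dimensional dynamics in `expert_model` and `real_env`, the `TinyMLP` class, the sample sizes, and the plain fine-tuning step, which is only one conceivable way to combine data with prior knowledge and is not claimed to be any of the paper's three methods. The Monte Carlo Tree Search planner that would query the learned model is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def expert_model(s, a):
    """Hypothetical black-box expert model (approximate dynamics)."""
    return 0.9 * s + 0.5 * a

def real_env(s, a):
    """Hypothetical real dynamics, not known exactly to the expert."""
    return 0.8 * s + 0.7 * a

def sample_triplets(dynamics, n):
    """Draw n random (state, action) pairs and compute their next states."""
    x = rng.uniform(-1.0, 1.0, size=(n, 2))  # columns: state, action
    return x, dynamics(x[:, 0], x[:, 1])

class TinyMLP:
    """One-hidden-layer tanh network trained by full-batch gradient descent."""

    def __init__(self, hidden=16):
        self.W1 = rng.normal(0.0, 0.3, (2, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.3, (hidden, 1))
        self.b2 = np.zeros(1)

    def predict(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return (h @ self.W2 + self.b2)[:, 0]

    def fit(self, x, y, epochs=2000, lr=0.05):
        n = len(x)
        for _ in range(epochs):
            h = np.tanh(x @ self.W1 + self.b1)
            pred = (h @ self.W2 + self.b2)[:, 0]
            g = (2.0 / n) * (pred - y)[:, None]    # gradient of MSE w.r.t. pred
            gh = (g @ self.W2.T) * (1.0 - h ** 2)  # backprop through tanh
            self.W2 -= lr * (h.T @ g)
            self.b2 -= lr * g.sum(axis=0)
            self.W1 -= lr * (x.T @ gh)
            self.b1 -= lr * gh.sum(axis=0)

def mse(model, x, y):
    return float(np.mean((model.predict(x) - y) ** 2))

# Step 1: "translate" the expert's model into a network by fitting
# the network to samples generated from the expert's model.
model = TinyMLP()
x_exp, y_exp = sample_triplets(expert_model, 200)
model.fit(x_exp, y_exp)

# Held-out triplets from the real environment, used only for evaluation.
x_test, y_test = sample_triplets(real_env, 200)
err_before = mse(model, x_test, y_test)

# Step 2: update the network with triplets collected from the real environment.
x_real, y_real = sample_triplets(real_env, 200)
model.fit(x_real, y_real)
err_after = mse(model, x_test, y_test)

print(f"prediction error on real dynamics: before={err_before:.5f}, after={err_after:.5f}")
```

In this toy setting the network starts as a copy of the expert's approximation and moves toward the real dynamics as real triplets arrive. A planner would call `model.predict` in place of the expert's model when simulating rollouts.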