Reinforcement Twinning: from digital twins to model-based reinforcement learning

Lorenzo Schena, Pedro Marques, Romain Poletti, Samuel Ahizi, Jan Van den Berghe, Miguel A. Mendez
2024-07-11
Abstract: Digital twins promise to revolutionize engineering by offering new avenues for optimization, control, and predictive maintenance. We propose a novel framework for simultaneously training the digital twin of an engineering system and an associated control agent. The twin's training combines adjoint-based data assimilation and system identification methods, while the control agent's training merges model-based optimal control with model-free reinforcement learning. The control agent evolves along two independent paths: one driven by model-based optimal control and the other by reinforcement learning. The digital twin serves as a virtual environment for confrontation and indirect interaction, functioning as an "expert demonstrator." The best policy is selected for real-world interaction and cloned to the other path if training stagnates. We call this framework Reinforcement Twinning (RT). The framework is tested on three diverse engineering systems and control tasks: (1) controlling a wind turbine under varying wind speeds, (2) trajectory control of flapping-wing micro air vehicles (FWMAVs) facing wind gusts, and (3) mitigating thermal loads in managing cryogenic storage tanks. These test cases use simplified models with known ground truth closure laws. Results show that the adjoint-based digital twin training is highly sample-efficient, completing within a few iterations. For the control agent training, both model-based and model-free approaches benefit from their complementary learning experiences. The promising results pave the way for implementing the RT framework on real systems.
Systems and Control
What problem does this paper attempt to address?
This paper proposes a framework called Reinforcement Twinning (RT) that simultaneously trains the digital twin of an engineering system and an associated control agent. The twin is trained by combining adjoint-based data assimilation with system identification methods, while the control agent is trained by combining model-based optimal control with model-free reinforcement learning. The control agent evolves along two independent paths, one driven by model-based optimal control and the other by reinforcement learning. The digital twin serves as a virtual environment in which the two paths confront each other and interact indirectly, acting as an "expert demonstrator." The best policy is selected for interaction with the real system and is cloned to the other path if training stagnates. The framework is tested on three engineering systems and control tasks: controlling a wind turbine under varying wind speeds, trajectory control of flapping-wing micro air vehicles (FWMAVs) facing wind gusts, and mitigating thermal loads in cryogenic storage tanks. All three test cases use simplified models with known ground-truth closure laws. The results show that the adjoint-based training of the digital twin is highly sample-efficient and completes within a few iterations, and that the model-based and model-free approaches to training the control agent both benefit from their complementary learning experiences. The paper's key contribution is coupling digital twin training with reinforcement learning for control, which the authors present as a step toward implementing the RT framework on real systems.
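To make the two-path training loop more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation, and everything in it is an assumption made for illustration: the scalar system `x[k+1] = A_TRUE*x[k] + B*u[k]`, the quadratic cost, the names `rollout`, `fit_twin`, and `reinforcement_twinning`, and the `PATIENCE` threshold. A damped least-squares update stands in for adjoint-based data assimilation, a grid search on the twin stands in for model-based optimal control, and random-perturbation hill climbing stands in for model-free reinforcement learning.

```python
import random

# Hypothetical "real" system: x[k+1] = A_TRUE*x[k] + B*u[k]; A_TRUE is unknown to the twin.
A_TRUE, B = 0.9, 0.5
GRID = [0.05 * i for i in range(41)]  # candidate feedback gains for the model-based path
PATIENCE = 3                          # assumed stagnation threshold before cloning


def rollout(a, gain, x0=1.0, steps=20):
    """Simulate u = -gain*x on a system with parameter a; return states and quadratic cost."""
    x, xs, cost = x0, [x0], 0.0
    for _ in range(steps):
        u = -gain * x
        cost += x * x + 0.1 * u * u
        x = a * x + B * u
        xs.append(x)
    return xs, cost


def fit_twin(a_hat, xs, gain, lr=0.5):
    """Damped least-squares update of the twin parameter from one real trajectory
    (a simple stand-in for adjoint-based data assimilation)."""
    num = den = 0.0
    for x, x_next in zip(xs[:-1], xs[1:]):
        err = x_next - (a_hat * x + B * (-gain * x))  # one-step prediction error
        num += err * x
        den += x * x
    return a_hat + lr * num / den


def reinforcement_twinning(episodes=30, seed=0):
    rng = random.Random(seed)
    a_hat = 0.5            # initial (wrong) guess for the twin
    g_mb = g_mf = 0.0      # model-based and model-free feedback gains
    best_mf_cost = rollout(A_TRUE, g_mf)[1]
    stagnation = 0
    for _ in range(episodes):
        # Compare both policies on the twin; the better one acts on the real system.
        twin_mb, twin_mf = rollout(a_hat, g_mb)[1], rollout(a_hat, g_mf)[1]
        live = g_mb if twin_mb <= twin_mf else g_mf
        xs, _ = rollout(A_TRUE, live)
        a_hat = fit_twin(a_hat, xs, live)  # twin training from real data

        # Model-based path: optimize the gain on the updated twin.
        g_mb = min(GRID, key=lambda g: rollout(a_hat, g)[1])

        # Model-free path: random perturbation, evaluated on the real system.
        cand = g_mf + rng.gauss(0.0, 0.2)
        cand_cost = rollout(A_TRUE, cand)[1]
        if cand_cost < best_mf_cost:
            g_mf, best_mf_cost, stagnation = cand, cand_cost, 0
        else:
            stagnation += 1

        # Cloning: if the model-free path stagnates and the twin rates the
        # model-based policy higher, copy it across.
        if stagnation >= PATIENCE and rollout(a_hat, g_mb)[1] < rollout(a_hat, g_mf)[1]:
            g_mf, stagnation = g_mb, 0
            best_mf_cost = rollout(A_TRUE, g_mf)[1]
    return a_hat, g_mb, g_mf
```

Calling `reinforcement_twinning()` returns the identified twin parameter and the two gains. In this sketch cloning runs in one direction only (model-based to model-free) to keep the example short; the abstract describes cloning the best policy to whichever path is lagging.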