Online Model-Free Reinforcement Learning for the Automatic Control of a Flexible Wing Aircraft

Mohammed Abouheaf, Wail Gueaieb, Frank Lewis
DOI: https://doi.org/10.1049/iet-cta.2018.6163
2021-08-05
Abstract: The control problem of the flexible wing aircraft is challenging due to the prevailing and high nonlinear deformations in the flexible wing system. This urged for new control mechanisms that are robust to the real-time variations in the wing's aerodynamics. An online control mechanism based on a value iteration reinforcement learning process is developed for flexible wing aerial structures. It employs a model-free control policy framework and a guaranteed convergent adaptive learning architecture to solve the system's Bellman optimality equation. A Riccati equation is derived and shown to be equivalent to solving the underlying Bellman equation. The online reinforcement learning solution is implemented using means of an adaptive-critic mechanism. The controller is proven to be asymptotically stable in the Lyapunov sense. It is assessed through computer simulations and its superior performance is demonstrated on two scenarios under different operating conditions.
Systems and Control, Artificial Intelligence, Machine Learning, Robotics
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the automatic control of flexible wing aircraft. Because the flexible wing system deforms in a highly nonlinear way, traditional modeling and controller design become very difficult. The paper therefore proposes an online, model-free adaptive control mechanism based on reinforcement learning to handle the dynamic changes that flexible-wing drones encounter during real-time flight.

#### Main problem description

1. **Highly nonlinear deformation**: The wings of a flexible-wing drone undergo complex nonlinear deformations during flight, which makes traditional control methods that depend on accurate models difficult to apply.
2. **Real-time variable dynamic characteristics**: The dynamic characteristics of the flexible-wing system change over time, so the control strategy must adapt to these changes in real time.
3. **Lack of an accurate model**: An accurate mathematical model is difficult to obtain for such a complex and highly nonlinear system, so a control method that does not rely on one is needed.

#### Solution overview

To solve these problems, the paper proposes a reinforcement-learning-based value-iteration method with the following characteristics:

- **Model-free control**: It does not require an accurate system model and controls the flexible-wing drone through online learning.
- **Adaptive learning architecture**: An adaptive learning mechanism with guaranteed convergence is used to solve the system's Bellman optimality equation.
- **Stability and robustness**: The asymptotic stability of the controller is proved using Lyapunov stability theory, and its superior performance under different operating conditions is demonstrated in simulation.
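The value-iteration idea summarized above can be illustrated on a toy linear-quadratic problem. The following is a minimal sketch and not the paper's algorithm or aircraft model: the plant matrices `A` and `B`, the weights `Q` and `R`, and all function names are assumptions chosen for illustration, and a batch least-squares fit is swapped in for the paper's adaptive-critic update. The learner only sees sampled transitions `(x, u, x_next)` and fits a quadratic action-value function; `riccati_gain` runs the model-based Riccati recursion so the two results can be compared, in the spirit of the abstract's statement that a Riccati equation is equivalent to the underlying Bellman equation.

```python
import numpy as np

# Hypothetical 2-state, 1-input linear plant, used ONLY to generate data.
# The learner in q_value_iteration never reads A or B to build its policy.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)          # state cost weight (assumed)
R = np.array([[1.0]])  # control cost weight (assumed)
n, m = 2, 1            # state and input dimensions


def quad_features(z):
    """Features such that features @ theta == z @ H @ z for symmetric H."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms appear twice
    return scale * z[i] * z[j]


def theta_to_H(theta, dim):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((dim, dim))
    i, j = np.triu_indices(dim)
    H[i, j] = theta
    H[j, i] = theta
    return H


def greedy_gain(H):
    """Gain K of the policy u = -K x that minimises z @ H @ z, z = [x; u]."""
    H_ux, H_uu = H[n:, :n], H[n:, n:]
    return np.linalg.solve(H_uu, H_ux)


def q_value_iteration(num_iters=300, num_samples=60, seed=0):
    """Value iteration on a quadratic action-value function, from samples."""
    rng = np.random.default_rng(seed)
    H = np.zeros((n + m, n + m))  # value estimate starts at zero
    K = np.zeros((m, n))          # initial policy: u = 0
    for _ in range(num_iters):
        features, targets = [], []
        for _ in range(num_samples):
            x = rng.normal(size=n)
            u = rng.normal(size=m)      # exploratory input
            x_next = A @ x + B @ u      # stands in for a measured transition
            z_next = np.concatenate([x_next, -K @ x_next])
            # Bellman target: stage cost plus current value at the next step.
            targets.append(x @ Q @ x + u @ R @ u + z_next @ H @ z_next)
            features.append(quad_features(np.concatenate([x, u])))
        theta, *_ = np.linalg.lstsq(np.array(features), np.array(targets),
                                    rcond=None)
        H = theta_to_H(theta, n + m)
        K = greedy_gain(H)              # policy improvement
    return K, H


def riccati_gain(num_iters=2000):
    """Model-based reference: iterate the discrete-time Riccati recursion."""
    P = np.zeros((n, n))
    for _ in range(num_iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


if __name__ == "__main__":
    K_learned, _ = q_value_iteration()
    print("learned gain:", K_learned)
    print("Riccati gain:", riccati_gain())
```

In this linear-quadratic setting the sampled targets are exactly quadratic in `[x; u]`, so each least-squares fit reproduces one step of the Riccati recursion without using the model in the update. The flexible-wing problem described in the paper is nonlinear and time-varying, so this sketch should be read only as an illustration of the value-iteration structure.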
### Key technical points

- **Bellman optimality equation**: The system's Bellman optimality equation is solved through a value-iteration process to ensure the optimality of the control strategy.
- **Riccati equation**: A Riccati equation equivalent to the Bellman equation is derived, which further simplifies the solution process.
- **Actor-critic structure**: Neural networks implement the actor-critic structure, approximating the optimal policy and the value function respectively.

### Conclusion

This research proposes a computational and mathematical framework for designing a model-free control scheme for nonlinear processes with unknown dynamic models. Using an online adaptive learning algorithm, it achieves effective control of the flexible-wing drone and demonstrates superior performance and robustness in different scenarios.