Learning-Based Optimal Cooperative Formation Tracking Control for Multiple UAVs: A Feedforward-Feedback Design Framework
Boyang Zhang, Maolong Lv, Shaohua Cui, Xiangwei Bu, Ju H. Park
DOI: https://doi.org/10.1109/tase.2023.3322028
IF: 6.636
2024-01-01
IEEE Transactions on Automation Science and Engineering
Abstract: Although state-of-the-art cooperative control protocols can accomplish formation tracking for multiple unmanned aerial vehicles (UAVs), they cannot guarantee performance optimality when complex disturbances act on the multi-UAV system. To address this challenge, this work establishes a feedforward-feedback learning-based optimal control methodology for cooperative UAV formation tracking in the presence of complex disturbances. Specifically, backstepping-based feedback control transforms the UAV formation tracking problem into an equivalent optimal regulation problem. A learning-based feedforward control scheme is then devised, in which a cooperative policy iteration algorithm is formulated based on a two-player zero-sum game. A critic-only echo state network (ESN) approximates the optimal feedforward control policies, with an online adaptive tuning law and compensation terms that relax the persistency of excitation condition and eliminate the need for an initial admissible control. As a result, closed-loop stability is guaranteed in the sense that the tracking errors and ESN weights are uniformly ultimately bounded.
Note to Practitioners: In real-world scenarios, the flight of multiple UAVs is invariably affected by complex disturbances, which degrade tracking precision. There is an urgent need to enhance resistance to disturbances and ensure optimal performance for cooperative formation tracking of multiple UAVs. Beyond the capabilities of model-based controllers, reinforcement learning has shown promise in achieving robust control actions. By introducing a cooperative policy iteration algorithm based on a two-player zero-sum game, the tracking performance of the UAV formation can be further optimized.
To facilitate the practical application of reinforcement learning in UAV systems, the proposed algorithm relaxes the persistency of excitation condition by incorporating compensation terms into the ESN tuning law. It also removes the requirement for an initial admissible control by introducing a novel piecewise compensation term into the ESN tuning law, which is based on a newly proposed Lyapunov function.
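For orientation, a two-player zero-sum game of the kind the abstract refers to is commonly written in the generic form below. This is a textbook-style sketch and not the paper's own equations. Every symbol is an assumption introduced here for illustration: the regulation error $e$, feedforward control $u$, disturbance $d$, dynamics $f$, $g$, $k$, weights $Q$, $R$, and attenuation level $\gamma$.

```latex
% Assumed error dynamics after the feedback (backstepping) step:
\dot{e} = f(e) + g(e)\,u + k(e)\,d

% Zero-sum cost: the control u minimizes it, the disturbance d maximizes it
V^{*}(e) = \min_{u}\,\max_{d} \int_{t}^{\infty}
  \bigl( e^{\top} Q e + u^{\top} R u - \gamma^{2} d^{\top} d \bigr)\, d\tau

% Saddle-point policies from stationarity of the Hamiltonian
u^{*} = -\tfrac{1}{2}\, R^{-1} g(e)^{\top} \nabla V^{*}(e), \qquad
d^{*} = \tfrac{1}{2\gamma^{2}}\, k(e)^{\top} \nabla V^{*}(e)

% Hamilton-Jacobi-Isaacs equation satisfied by V^{*}
0 = e^{\top} Q e + u^{*\top} R u^{*} - \gamma^{2} d^{*\top} d^{*}
  + \nabla V^{*}(e)^{\top} \bigl( f(e) + g(e)\,u^{*} + k(e)\,d^{*} \bigr)
```

In a critic-only scheme, a single approximator of $V^{*}$ (here, the ESN) would supply $\nabla V^{*}$ for both policies, so no separate actor network is needed.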
automation & control systems