Abstract: With the development of unmanned aerial vehicle (UAV) and artificial intelligence (AI) technology, intelligent UAVs will be widely used in future autonomous aerial combat. Previous research on autonomous aerial combat within visual range (WVR) is limited by simplifying assumptions, limited robustness, and neglect of sensor errors. In this paper, to account for aircraft sensor errors, we model WVR aerial combat as a state-adversarial Markov decision process (SA-MDP), which introduces small adversarial perturbations on the state observations. These perturbations do not alter the environment directly, but they can mislead the agent into making suboptimal decisions. We then propose a novel autonomous aerial combat maneuver strategy generation algorithm with high performance and high robustness, based on the state-adversarial deep deterministic policy gradient (SA-DDPG) algorithm, which adds robustness regularizers, related to an upper bound on performance loss, to the actor network. In addition, a reward shaping method based on the maximum entropy (MaxEnt) inverse reinforcement learning (IRL) algorithm is proposed to improve the efficiency of the strategy generation algorithm. Finally, simulation experiments verify the efficiency of the strategy generation algorithm and the performance and robustness of the resulting aerial combat strategy. Our main contributions are threefold. First, to introduce the observation errors of the UAV, we model air combat as an SA-MDP. Second, to make the air combat maneuver strategy network more robust in the presence of observation errors, we introduce regularizers into the policy gradient. Third, to address the sparsity of the air combat reward function, we use MaxEnt IRL to design a shaping reward that accelerates the convergence of SA-DDPG.
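As a rough illustration of the robustness regularizer described in the abstract, the following Python sketch penalizes how far the actor's action moves when the observed state is perturbed within a small bound. The abstract does not give the exact form of the regularizer, so everything here is an assumption made for illustration: the single-layer `policy`, the squared-distance penalty, the random sampling used to approximate the worst-case perturbation, and the names `eps`, `kappa`, and `critic`. It is not the paper's implementation.

```python
import numpy as np

def policy(theta, s):
    """Hypothetical deterministic actor: one tanh layer mapping states to actions."""
    W, b = theta
    return np.tanh(s @ W + b)

def robustness_penalty(theta, s, eps, n_samples=32, rng=None):
    """Approximate, per state, the largest squared action change caused by an
    observation perturbation with |delta_i| <= eps, then average over the batch.
    Random sampling is a crude stand-in (assumption) for a true inner maximization."""
    rng = np.random.default_rng(0) if rng is None else rng
    a_clean = policy(theta, s)
    worst = np.zeros(s.shape[0])  # s is expected to have shape (batch, state_dim)
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=s.shape)   # perturb the observation only
        a_pert = policy(theta, s + delta)
        gap = np.sum((a_pert - a_clean) ** 2, axis=-1)
        worst = np.maximum(worst, gap)                 # keep the worst case per state
    return float(np.mean(worst))

def actor_loss(theta, s, critic, eps, kappa):
    """DDPG-style actor objective plus the robustness term, weighted by kappa.
    `critic(s, a)` is a hypothetical callable returning one Q-value per state."""
    q = critic(s, policy(theta, s))
    return float(-np.mean(q)) + kappa * robustness_penalty(theta, s, eps)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    theta = (rng.normal(size=(4, 2)), np.zeros(2))
    states = rng.normal(size=(8, 4))
    toy_critic = lambda s, a: -np.sum(a ** 2, axis=-1)  # placeholder critic
    print(actor_loss(theta, states, toy_critic, eps=0.05, kappa=1.0))
```

Note that the perturbation is applied only to the state passed to the actor, never to the environment, which mirrors the SA-MDP setting in which observation errors mislead the agent without changing the true state.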

Aircraft Upset Recovery Strategy and Pilot Assistance System Based on Reinforcement Learning

Two-Stage Strategy to Achieve a Reinforcement Learning-Based Upset Recovery Policy for Aircraft

Deep reinforcement learning-based upset recovery control for generic transport aircraft

Control reconfiguration design for control surface fault of small unmanned aerial vehicle

Adaptive Sliding Mode Control with Transition Process for Flapping Wing Aerial Vehicle

Model Predictive Control Based Washout Algorithm Design for Flight Simulator Upset Prevention and Recovery Training

UAV Autonomous Aerial Combat Maneuver Strategy Generation with Observation Error Based on State-Adversarial Deep Deterministic Policy Gradient and Inverse Reinforcement Learning

Re-entry vehicle autopilot design using dynamic inversion with L1 adaptive control augmentation

Dynamic Control Allocation between Onboard and Delayed Remote Control for Unmanned Aircraft System Detect-and-Avoid

Machine Learning Based Flight State Prediction for Improving UAV Resistance to Uncertainty

Control System Architecture for Automatic Recovery of Fixed-Wing Unmanned Aerial Vehicles in a Moving Arrest System

Risk Analysis of Airplane Upsets in Flight: An Integrated System Framework and Analysis Methodology

Online Safe Flight Control Method Based on Constraint Reinforcement Learning

Aircraft Ground Braking Assistant Control Based On Pilot Control Model

Research of key technologies for multi-rotor UAV automatic aerial recovery system

Fault-tolerant Control for Unmanned Aerial Vehicle Using Deep Reinforcement Learning

Assessment of piloting behavior impact on landing risk of transport aircraft

Full-Altitude Attitude Angles Envelope and Model Predictive Control-Based Attitude Angles Protection for Civil Aircraft

Analysis of Human Factors in Typical Accident Tests of Certain Type Flight Simulator

Missed Approach, a Safety-Critical Go-Around Procedure in Aviation: Prediction Based on Machine Learning-Ensemble Imbalance Learning

Adaptive PI Control Based Stability Margin Configuration of Aircraft Control Systems with Unknown System Parameters and Time Delay