Abstract: With the development of unmanned aerial vehicle (UAV) and artificial intelligence (AI) technology, intelligent UAVs will be widely used in future autonomous aerial combat. Previous research on autonomous aerial combat within visual range (WVR) is limited by simplifying assumptions, limited robustness, and neglect of sensor errors. In this paper, to account for aircraft sensor errors, we model WVR aerial combat as a state-adversarial Markov decision process (SA-MDP), which introduces small adversarial perturbations on state observations. These perturbations do not alter the environment directly, but they can mislead the agent into making suboptimal decisions. We also propose a novel high-performance, high-robustness algorithm for generating autonomous aerial combat maneuver strategies, based on the state-adversarial deep deterministic policy gradient algorithm (SA-DDPG), which adds to the actor network a robustness regularizer related to an upper bound on performance loss. In addition, a reward shaping method based on the maximum entropy (MaxEnt) inverse reinforcement learning (IRL) algorithm is proposed to improve the efficiency of the strategy generation algorithm. Finally, the efficiency of the aerial combat strategy generation algorithm and the performance and robustness of the resulting aerial combat strategy are verified by simulation experiments. Our main contributions are three-fold. First, to introduce the observation errors of the UAV, we model air combat as an SA-MDP. Second, to make the air combat maneuver strategy network more robust in the presence of observation errors, we introduce regularizers into the policy gradient. Third, to address the sparsity of the air combat reward function, we use MaxEnt IRL to design a shaping reward that accelerates the convergence of SA-DDPG.
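
The two ideas at the core of the abstract, perturbing only the observation and penalizing how much the actor's output changes under that perturbation, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the linear `policy`, the function names, the uniform noise model, and the use of random sampling as a crude stand-in for the worst-case perturbation. It is not the paper's implementation, whose details the abstract does not give.

```python
import numpy as np


def perturb_observation(state, epsilon, rng):
    """Return an observation inside an l-infinity ball of radius epsilon.

    The true state is not modified; only what the agent sees changes.
    (Uniform noise is an assumed error model, chosen for illustration.)
    """
    noise = rng.uniform(-epsilon, epsilon, size=state.shape)
    return state + noise


def policy(weights, obs):
    """Hypothetical deterministic actor: a linear map squashed by tanh."""
    return np.tanh(weights @ obs)


def robustness_penalty(weights, state, epsilon, rng, n_samples=32):
    """Sampled estimate of max ||pi(s) - pi(s_hat)||_2 over perturbed s_hat.

    Random sampling only lower-bounds the true maximum; it is a simple
    stand-in for a proper worst-case search.
    """
    clean_action = policy(weights, state)
    worst = 0.0
    for _ in range(n_samples):
        s_hat = perturb_observation(state, epsilon, rng)
        gap = float(np.linalg.norm(policy(weights, s_hat) - clean_action))
        worst = max(worst, gap)
    return worst


rng = np.random.default_rng(0)
state = np.array([0.5, -1.0, 0.2])   # illustrative state vector
weights = rng.normal(size=(2, 3))    # illustrative actor parameters
obs = perturb_observation(state, 0.05, rng)
penalty = robustness_penalty(weights, state, 0.05, rng)
```

In a training loop, a term like `penalty` would be added to the actor loss with some weighting coefficient, so that the actor is discouraged from producing very different actions for observations that differ only by sensor-level noise.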

End-to-end UAV obstacle avoidance decision based on deep reinforcement learning

Autonomous obstacle avoidance of UAV based on deep reinforcement learning

Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning

Autonomous UAV Navigation with Adaptive Control Based on Deep Reinforcement Learning

Deep Reinforcement Learning With Application to Air Confrontation Intelligent Decision-Making of Manned/Unmanned Aerial Vehicle Cooperative System

Deep reinforcement learning and its application in autonomous fitting optimization for attack areas of UCAVs

End-to-end UAV Intelligent Training via Deep Reinforcement Learning

UAV Obstacle Avoidance by Human-in-the-Loop Reinforcement in Arbitrary 3D Environment

Autonomous maneuver decision-making for a UCAV in short-range aerial combat based on an MS-DDQN algorithm

Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning

Vision Based Drone Obstacle Avoidance by Deep Reinforcement Learning

Generalization Strategy Design of UAVs Pursuit Evasion Game Based on DDPG

A Vision Based Deep Reinforcement Learning Algorithm for UAV Obstacle Avoidance

A decision-making of autonomous driving method based on DDPG with pretraining

Autonomous Decision-Making Method for Combat Mission of UAV based on Deep Reinforcement Learning

Deep-reinforcement-learning-based UAV autonomous navigation and collision avoidance in unknown environments

Autonomous quadrotor obstacle avoidance based on dueling double deep recurrent Q-learning with monocular vision

UAV maneuver decision-making via deep reinforcement learning for short-range air combat

AUV path tracking with real-time obstacle avoidance via reinforcement learning under adaptive constraints

UAV Autonomous Aerial Combat Maneuver Strategy Generation with Observation Error Based on State-Adversarial Deep Deterministic Policy Gradient and Inverse Reinforcement Learning

Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning