Abstract: Reconnaissance of high-value targets is a prerequisite for effective operations. Deep reinforcement learning (DRL) has attracted recent attention because of its success in navigation problems, but owing to the competitiveness and complexity of the military field, its applications there remain unsatisfactory. In this paper, an end-to-end DRL-based intelligent reconnaissance mission planning method is proposed for dual unmanned aerial vehicle (dual-UAV) cooperative reconnaissance missions in high-threat, dense environments. Specific mission properties and parameter requirements are considered throughout the modelling. Firstly, the reconnaissance mission is described as a Markov decision process (MDP), and a DRL-based mission planning model is established. Secondly, the environment and UAV motion parameters are standardized before being input to the neural network, aiming to reduce the difficulty of algorithm convergence. According to the concrete mission requirements of avoiding detection by radars, dual-UAV cooperation, and wandering reconnaissance, four weighted reward functions are designed to improve the agent's understanding of the mission. To avoid sparse rewards, a clip function is used to control the range of reward values. Finally, considering the continuous action space of reconnaissance mission planning, the widely applicable proximal policy optimization (PPO) algorithm is used. The simulation combines offline training with online planning. When the location and number of ground detection areas are varied, from 1 to 4, the model with PPO maintains a reconnaissance proportion of 20% and a mission completion rate of 90%, and helps the reconnaissance UAV complete efficient path planning. The model adapts to unknown, continuous, high-dimensional environmental changes, generalizes, and shows strong intelligent planning performance.
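The input scaling and the weighted, clipped reward mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the function names (`scale_input`, `clip`, `combined_reward`), the min-max scaling scheme, the choice to clip the weighted sum rather than each term, and the [-1, 1] clipping range.

```python
from typing import Sequence


def scale_input(value: float, low: float, high: float) -> float:
    """Min-max scale a raw environment or motion parameter into [0, 1].

    Assumed scheme: the abstract only says the inputs are standardized.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    return (value - low) / (high - low)


def clip(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def combined_reward(
    terms: Sequence[float],
    weights: Sequence[float],
    low: float = -1.0,
    high: float = 1.0,
) -> float:
    """Weighted sum of reward terms, clipped to [low, high].

    Hypothetical combination rule: the weights and the clipping range
    are placeholders, not values taken from the paper.
    """
    if len(terms) != len(weights):
        raise ValueError("terms and weights must have the same length")
    total = sum(w * r for w, r in zip(weights, terms))
    return clip(total, low, high)
```

In this sketch, four reward terms would be passed as `terms` with one weight each; keeping the result in a bounded range is one common way to keep reward magnitudes comparable across steps during PPO training.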

Deep reinforcement learning for unmanned aerial vehicles cluster task allocation

DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV

Digital Twin Assisted Task Assignment in Multi-UAV Systems: A Deep Reinforcement Learning Approach

Task Assignment for UAV Swarm Saturation Attack: A Deep Reinforcement Learning Approach

Task Allocation of Multiple Unmanned Aerial Vehicles Based on Deep Transfer Reinforcement Learning

Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments

Deep Reinforcement Learning for UAV Intelligent Mission Planning

Resource Allocation in UAV-Assisted Networks: A Clustering-Aided Reinforcement Learning Approach

A resource-constrained distributed task allocation method based on a two-stage coalition formation methodology for multi-UAVs

Deep Reinforcement Learning for Intelligent Dual-UAV Reconnaissance Mission Planning

Deep Reinforcement Learning With Application to Air Confrontation Intelligent Decision-Making of Manned/Unmanned Aerial Vehicle Cooperative System

Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning

Collaborative Coverage Path Planning of UAV Cluster based on Deep Reinforcement Learning

Reinforcement Learning Assisted Multi-UAV Task Allocation and Path Planning for IIoT

Group-Based Deep Reinforcement Learning in Multi-UAV Confrontation

Autonomous maneuver decision-making for a UCAV in short-range aerial combat based on an MS-DDQN algorithm

Resource Allocation in UAV-D2D Networks: A Scalable Heterogeneous Multi-Agent Deep Reinforcement Learning Approach

Asynchronous Curriculum Experience Replay: A Deep Reinforcement Learning Approach for UAV Autonomous Motion Control in Unknown Dynamic Environments

Joint Task Offloading and Resource Allocation in Multi-UAV Multi-Server Systems: An Attention-based Deep Reinforcement Learning Approach

Deep reinforcement learning and its application in autonomous fitting optimization for attack areas of UCAVs

Dynamic Reallocation Model of Multiple Unmanned Aerial Vehicle Tasks in Emergent Adjustment Scenarios