Abstract: Autonomous decision-making in unmanned aerial vehicle (UAV) confrontations makes it difficult to derive optimal strategies, and deep reinforcement learning (DRL) has been adopted to address this difficulty. However, existing DRL decision-making models suffer from poor situational awareness and cannot distinguish between different intentions. This paper therefore proposes a multi-intent autonomous decision-making method. First, three typical intentions (head-on attacking, pursuing, and fleeing) are designed, and a decision model is derived for each. A reinforcement-learning-based air combat game model is constructed for the different intentions, including intention-specific reward functions that address the problem of sparse rewards. Then, we propose the Temporal Proximal Policy Optimization (T-PPO) algorithm, which extends the Proximal Policy Optimization algorithm by integrating a long short-term memory network with a feedforward neural network. The algorithm extracts historical temporal information to enhance situational awareness. In addition, a basic-confrontation progressive training method is proposed to provide intention guidance and increase training diversity, which improves learning efficiency and intelligent decision-making capability. Finally, experiments in our constructed UAV confrontation environment demonstrate that the proposed intentional decision models perform well in stability and learning efficiency, achieving high rewards, high win rates, and low step counts. Specifically, our autonomous decision-making method increases the win rate by 26% in head-on attacking and the learning efficiency by 50% in pursuing. These results further demonstrate the potential and value of multi-intent autonomous decision-making applications.
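
The abstract names Proximal Policy Optimization as the base algorithm of T-PPO but does not spell out its objective. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective in plain Python. It is not the authors' T-PPO implementation: the LSTM and feedforward layers are omitted, and the function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions made for this example.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    Hypothetical helper for illustration; all names here are assumptions.
    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    """
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(nlp - olp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Identical policies give a ratio of 1, so the result is the mean advantage.
    print(ppo_clip_objective([0.0, 0.0], [0.0, 0.0], [1.0, 3.0]))
```

Clipping bounds how much a single update can gain from moving the policy away from the one that collected the data. In a recurrent variant such as the one the abstract describes, the log-probabilities would presumably come from a network that also receives a history of observations, but that detail is not given here.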

Air-to-ground Shepherd Problem: an Action-Delay Reinforcement Learning Approach

Hierarchical Decision and Control for Continuous Multitarget Problem: Policy Evaluation with Action Delay

Model-free Maneuvering Control of Fixed-Wing UAVs Based on Deep Reinforcement Learning

Learning to Herd Agents Amongst Obstacles: Training Robust Shepherding Behaviors Using Deep Reinforcement Learning

Path Planning of Unmanned Aerial Vehicle in Complex Environments Based on State-Detection Twin Delayed Deep Deterministic Policy Gradient

Deep Reinforcement Learning With Application to Air Confrontation Intelligent Decision-Making of Manned/Unmanned Aerial Vehicle Cooperative System

Group-Based Deep Reinforcement Learning in Multi-UAV Confrontation

Trajectory Design and Access Control for Air–Ground Coordinated Communications System With Multiagent Deep Reinforcement Learning

Multi-intent autonomous decision-making for air combat with deep reinforcement learning

Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning

Reactive shepherding along a dynamic path

Asynchronous Curriculum Experience Replay: A Deep Reinforcement Learning Approach for UAV Autonomous Motion Control in Unknown Dynamic Environments

Scheduling Drone and Mobile Charger via Hybrid-Action Deep Reinforcement Learning

Continuous Deep Hierarchical Reinforcement Learning for Ground-Air Swarm Shepherding

Deep Reinforcement Learning for Time-Critical Wilderness Search And Rescue Using Drones

A UAV Maneuver Decision-Making Algorithm for Autonomous Airdrop Based on Deep Reinforcement Learning

3D-Trajectory and Phase-Shift Design for RIS-Assisted UAV Systems Using Deep Reinforcement Learning

Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients

Multi-UAV Speed Control with Collision Avoidance and Handover-aware Cell Association: DRL with Action Branching

Deep Reinforcement Learning for Autonomous Ground Vehicle Exploration Without A-Priori Maps

Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments