Abstract: Reinforcement learning has been applied to air combat problems in recent years, often in combination with curriculum learning. Traditional curriculum learning, however, suffers from plasticity loss in neural networks: once a network has converged, it has difficulty learning new knowledge. To address this, we propose a motivational curriculum learning distributed proximal policy optimization (MCLDPPO) algorithm; agents trained with it significantly outperform the predictive game tree and mainstream reinforcement learning methods. Motivational curriculum learning helps the agent gradually improve its combat ability by observing where its performance is unsatisfactory and providing appropriate rewards as guidance. Furthermore, complete tactical maneuvers are encapsulated based on existing air combat knowledge, and through flexible use of these maneuvers the agent can realize some tactics beyond human knowledge. In addition, we design an interruption mechanism that increases the agent's decision-making frequency in emergencies: when the number of threats the agent faces changes, the current action is interrupted so that the agent can reacquire observations and decide again. The interruption mechanism significantly improves the agent's performance. To better simulate actual air combat, we use digital twin technology and propose a parallel battlefield mechanism that runs multiple simulation environments simultaneously, effectively improving data throughput. The experimental results demonstrate that the agent can fully utilize situational information to make reasonable decisions and adapt its tactics in air combat, verifying the effectiveness of the proposed algorithmic framework.
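The interruption mechanism described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function `run_episode`, the toy policy, the maneuver names, and the maneuver durations are all hypothetical, and the threat count is assumed to be available as one integer per simulation tick. The sketch only shows the control flow, in which a multi-tick maneuver is cut short and a new decision is made whenever the observed number of threats changes.

```python
from typing import Callable, List, Sequence, Tuple

Decision = Tuple[int, str, bool]  # (tick, maneuver name, was it an interrupt?)


def run_episode(
    threat_counts: Sequence[int],
    choose_maneuver: Callable[[int], Tuple[str, int]],
) -> List[Decision]:
    """Record every decision made over one episode.

    threat_counts[t] is the (assumed) number of threats observed at tick t.
    choose_maneuver is a hypothetical policy mapping a threat count to
    (maneuver name, duration in ticks).
    """
    decisions: List[Decision] = []
    remaining = 0      # ticks left in the maneuver currently being executed
    last_count = None  # threat count observed at the previous tick

    for tick, count in enumerate(threat_counts):
        changed = last_count is not None and count != last_count
        if remaining <= 0 or changed:
            # A re-decision forced while a maneuver is still running
            # counts as an interrupt.
            interrupted = changed and remaining > 0
            name, duration = choose_maneuver(count)
            remaining = max(1, duration)  # every maneuver lasts at least one tick
            decisions.append((tick, name, interrupted))
        remaining -= 1
        last_count = count
    return decisions


def toy_policy(threat_count: int) -> Tuple[str, int]:
    """Hypothetical stand-in for the trained agent."""
    return ("evade", 2) if threat_count > 0 else ("pursue", 4)


if __name__ == "__main__":
    trace = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
    for decision in run_episode(trace, toy_policy):
        print(decision)
```

On the sample trace, the re-decisions at ticks 2 and 5 are interrupts, because the threat count changes while a maneuver is still in progress; the other decisions occur only when the previous maneuver has finished.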
