Deep reinforcement learning-based air combat maneuver decision-making: literature review, implementation tutorial and future direction

Xinwei Wang, Yihui Wang, Xichao Su, Lei Wang, Chen Lu, Haijun Peng, Jie Liu
DOI: https://doi.org/10.1007/s10462-023-10620-2
IF: 9.588
2023-12-28
Artificial Intelligence Review
Abstract: Nowadays, various innovative air combat paradigms that rely on unmanned aerial vehicles (UAVs), i.e., UAV swarms and UAV-manned aircraft cooperation, have received great attention worldwide. During operation, UAVs are expected to perform agile and safe maneuvers according to dynamic mission requirements and a complicated battlefield environment. Deep reinforcement learning (DRL), which is well suited to sequential decision-making processes, provides a powerful tool for air combat maneuver decision-making (ACMD), and hundreds of related research papers have been published in the last five years. However, because the topic is still emerging, a systematic review and tutorial is lacking. For this reason, this paper first provides a comprehensive literature review to help readers grasp the whole picture of the field. It starts from DRL itself and then extends to its application in ACMD. Special attention is given to the design of the reward function, which is the core of DRL-based ACMD. Then, a maneuver decision-making method based on one-to-one dogfight scenarios is proposed to enable a UAV to win short-range air combat. The model establishment, program design, training methods and performance evaluation are described in detail. The associated Python code is available at gitee.com/wangyyhhh, enabling researchers to quickly build their own ACMD applications with slight modifications. Finally, limitations of the considered model, as well as possible future research directions for intelligent air combat, are discussed.
computer science, artificial intelligence
What problem does this paper attempt to address?
This paper addresses how unmanned aerial vehicles (UAVs) can perform agile and safe maneuvers in modern air combat under dynamic mission requirements and complex battlefield environments. Specifically, it focuses on using deep reinforcement learning (DRL) for air combat maneuver decision-making (ACMD), with a tutorial built on one-on-one air combat scenarios. The paper points out that DRL's suitability for sequential decision processes makes it a powerful tool for ACMD, but that the field, being an emerging one, lacks systematic reviews and tutorials. It therefore aims to help researchers understand and apply DRL to ACMD by providing a comprehensive literature review, an implementation tutorial, and future research directions.

The main contributions of the paper include:

1. **Literature Review**: A comprehensive review of DRL and its application in ACMD, with a particular focus on the design of reward functions, which is the core of DRL-based ACMD.
2. **Implementation Tutorial**: A maneuver decision-making method based on one-on-one air combat scenarios, with model building, program design, training methods, and performance evaluation described in detail. Python code is provided so that researchers can quickly start their own ACMD applications.
3. **Future Directions**: A discussion of the limitations of the considered model and of possible directions for future intelligent air combat research.

In summary, the paper is dedicated to filling the gap in systematic resources on DRL for ACMD, giving researchers a guide that runs from theory to practice.
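
Since the paper singles out reward design as the core of DRL-based ACMD, a small illustration of what such a reward might look like can help make the idea concrete. The sketch below is not taken from the paper or its repository. It is a toy 2D shaped reward, assuming the agent is rewarded for pointing its nose at the opponent and for staying near a preferred engagement range. The function name `shaped_reward` and the parameters `ideal_range` and `range_scale` are hypothetical, chosen only for illustration.

```python
import math


def shaped_reward(own_pos, own_heading, enemy_pos,
                  ideal_range=1000.0, range_scale=5000.0):
    """Toy 2D dogfight reward in [0, 1] (illustrative assumption, not the paper's).

    own_pos, enemy_pos: (x, y) positions in metres.
    own_heading: own heading in radians, measured from the +x axis.
    ideal_range, range_scale: hypothetical tuning constants in metres.
    """
    dx = enemy_pos[0] - own_pos[0]
    dy = enemy_pos[1] - own_pos[1]
    distance = math.hypot(dx, dy)

    # Angle between own heading and the line of sight to the opponent,
    # wrapped into [0, pi].
    line_of_sight = math.atan2(dy, dx)
    off_angle = abs((line_of_sight - own_heading + math.pi) % (2 * math.pi) - math.pi)

    # 1.0 when pointing straight at the opponent, 0.0 when pointing away.
    angle_term = 1.0 - off_angle / math.pi
    # 1.0 at the preferred range, decaying as the range error grows.
    range_term = math.exp(-abs(distance - ideal_range) / range_scale)

    return angle_term * range_term


if __name__ == "__main__":
    # Opponent dead ahead at the preferred range versus directly behind.
    print(shaped_reward((0.0, 0.0), 0.0, (1000.0, 0.0)))
    print(shaped_reward((0.0, 0.0), 0.0, (-1000.0, 0.0)))
```

Multiplying the two terms means the agent earns a high reward only when both geometry and range are favourable. A weighted sum would be an equally plausible choice, and a real design would also need terminal rewards for winning, losing, or leaving the arena.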