Interpretable DRL-based Maneuver Decision of UCAV Dogfight

Haoran Han, Jian Cheng, Maolong Lv
2024-05-28
Abstract: This paper proposes a three-layer unmanned combat aerial vehicle (UCAV) dogfight framework in which deep reinforcement learning (DRL) is responsible for high-level maneuver decisions. A four-channel low-level control law is first constructed, followed by a library of eight basic flight maneuvers (BFMs). A double deep Q-network (DDQN) is applied for BFM selection in UCAV dogfights, with the opponent strategy during training constructed from a decision tree (DT). Simulation results show that the agent achieves a win rate of 85.75% against the DT strategy and positive results against various unseen opponents. With the proposed framework, the interpretability of DRL-based dogfighting is significantly improved. The agent performs yo-yo maneuvers to adjust its turn rate and gain higher maneuverability. The emergence of "Dive and Chase" behavior also indicates that the agent can generate a novel tactic that exploits a weakness of its opponent.
Robotics, Machine Learning
What problem does this paper attempt to address?
This paper aims to improve the maneuver decision-making of Unmanned Combat Aerial Vehicles (UCAVs) in air-to-air combat, especially in a complex 6-Degrees-of-Freedom (6-DOF) dynamic environment. Specifically, the authors propose a three-layer UCAV air-combat framework based on Deep Reinforcement Learning (DRL) that enhances both the interpretability and the performance of the DRL model.

### Main problems:
1. **Limitations of existing methods**:
   - Traditional air-combat decision-making methods such as Decision Trees (DT) or state machines are easy to implement and interpret, but they cannot cover all complex air-combat situations.
   - Most existing DRL research focuses on simplified 3-Degrees-of-Freedom (3-DOF) dynamic models, which are far from actual air-combat scenarios.
   - Most DRL studies only claim that their algorithms outperform traditional methods, without explaining the DRL agent's behavior in depth.
2. **Improving interpretability**:
   - The paper aims to improve the interpretability of DRL agents in air-to-air combat by introducing an interpretable Basic Flight Maneuver (BFM) library and an improved DRL algorithm.
3. **Improving performance**:
   - Under the proposed framework, the DRL agent achieves a win rate of 85.75% against the DT opponent and generates new tactics, such as "Dive and Chase", that take advantage of the opponent's weaknesses.

### Solutions:
1. **Three-layer UCAV air-combat framework**:
   - **Low-level control laws**: A four-channel low-level control law is designed, covering speed, angle of attack, roll angle, and sideslip angle.
   - **BFM library**: A library of eight commonly used basic flight maneuvers is constructed, such as position tracking, attitude tracking, straight flight, and climbing.
   - **High-level decision-making**: A Double Deep Q-Network (DDQN) selects BFMs to handle the complex air-combat environment.
2. **Interpretability analysis**:
   - Posterior analysis of the agent's behavior shows that it adjusts its turn rate through "yo-yo" maneuvers, thereby obtaining higher maneuverability.
   - The emergence of the "Dive and Chase" strategy is observed, indicating that the agent can generate novel tactics in complex situations.
3. **Simulation verification**:
   - Extensive simulation experiments are carried out using the F-16 model to verify the effectiveness and robustness of the proposed framework.

### Summary:
This paper addresses the limitations of existing methods in complex air-combat environments by proposing a three-layer, DRL-based UCAV air-combat framework, improving both the interpretability and the performance of DRL agents.
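
The high-level decision layer described above, in which a DDQN chooses one of eight BFMs, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `select_bfm` and `ddqn_target`, the epsilon-greedy exploration rule, and the discount factor `gamma=0.99` are assumptions made for illustration, and the Q-values are plain Python lists standing in for network outputs. What it shows is the standard Double DQN idea: the online network picks the next BFM, and the target network evaluates it.

```python
import random

# The paper's library holds eight BFMs; the indices 0..7 are placeholders here,
# since only some maneuver names are given in the summary above.
N_BFMS = 8


def select_bfm(q_values, epsilon, rng=random):
    """Epsilon-greedy choice of a BFM index (hypothetical exploration rule).

    q_values: list of N_BFMS online-network Q-values for the current state.
    """
    if rng.random() < epsilon:
        return rng.randrange(N_BFMS)  # explore: random maneuver
    return max(range(N_BFMS), key=lambda a: q_values[a])  # exploit: greedy maneuver


def ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target for one transition.

    The online network selects the next action; the target network supplies
    its value. gamma is an assumed discount factor, not taken from the paper.
    """
    if done:
        return reward  # no bootstrapping at the end of an engagement
    best_next = max(range(N_BFMS), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_next]


if __name__ == "__main__":
    q_online = [0.0] * N_BFMS
    q_online[3] = 5.0  # online net prefers BFM index 3
    q_target = [1.0] * N_BFMS
    q_target[3] = 2.0  # target net's value for that BFM

    print(select_bfm(q_online, epsilon=0.0))
    print(ddqn_target(1.0, False, q_online, q_target))
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN, which would take the maximum over the target network's own values and tends to overestimate them.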