Research on Deep Q-Network Hybridization with Extended Kalman Filter in Maneuvering Decision of Unmanned Combat Aerial Vehicles

Juntao Ruan, Yi Qin, Fei Wang, Jianjun Huang, Fujie Wang, Fang Guo, Yaohua Hu
DOI: https://doi.org/10.3390/math12020261
IF: 2.4
2024-01-13
Mathematics
Abstract: To adapt to the development trend of intelligent air combat, it is necessary to research the autonomous generation of maneuvering decisions for unmanned combat aerial vehicles (UCAV). This paper presents a maneuver decision-making method for UCAV based on a hybridization of deep Q-network (DQN) and extended Kalman filtering (EKF). Firstly, a three-dimensional air combat simulation environment is constructed, and a flight motion model of UCAV is designed to meet the requirements of the simulation environment. Secondly, we evaluate the current situation of UCAV based on their state variables in air combat, for further network learning and training to obtain the optimal maneuver strategy. Finally, based on the DQN, the system state equation is constructed using the uncertain parameter values of the current network, and the observation equation of the system is constructed using the parameters of the target network. The optimal parameter estimation value of the DQN is obtained by iteratively updating the solution through EKF. Simulation experiments have shown that this autonomous maneuver decision-making method hybridizing DQN with EKF is effective and reliable, as it can eliminate the opponent and preserve its side.
What problem does this paper attempt to address?
This paper addresses the autonomous generation of maneuvering decisions for unmanned combat aerial vehicles (UCAVs) in intelligent air combat. It proposes a method that hybridizes a deep Q-network (DQN) with an extended Kalman filter (EKF) to improve the UCAV's autonomous maneuver decision-making in complex, dynamic aerial environments, so that the UCAV can eliminate its opponent and protect itself more effectively and reliably. The paper focuses on three aspects:

1. **Constructing a three-dimensional air combat simulation environment**: A UCAV flight motion model is designed to meet the requirements of the simulation environment.
2. **Evaluating the current UCAV situation**: The situation is evaluated from the UCAV's state variables in air combat and is then used for network learning and training to obtain the optimal maneuver strategy.
3. **Hybridizing DQN and EKF**: The system state equation is constructed from the uncertain parameter values of the current network, and the observation equation is constructed from the parameters of the target network. Iteratively updating the solution through the EKF yields the optimal estimate of the DQN parameters.

With this approach, the paper aims to reduce the parameter estimation deviation that arises in the existing DQN algorithm from the influence of the experience pool and the target network during parameter updates, thereby improving the stability and convergence of UCAV maneuver decision-making.
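
The idea of estimating Q-function parameters with an EKF can be illustrated with a minimal sketch. The code below is not the paper's implementation. It assumes a linear Q-function Q(s, a; θ) = θ·φ(s, a) in place of a deep network, so the EKF Jacobian reduces to the feature vector φ(s, a), and it assumes a random-walk state equation for the weights. The function name `ekf_q_update`, the feature vectors, and the noise variances `process_var` and `obs_var` are all hypothetical choices made for illustration.

```python
import numpy as np


def ekf_q_update(theta, P, phi_sa, reward, phi_next_all, theta_target,
                 gamma=0.9, process_var=1e-4, obs_var=1.0, done=False):
    """One EKF step that treats the Q-function weights as the hidden state.

    Assumed model (illustrative, not taken from the paper):
      state equation:       theta_k = theta_{k-1} + w,  w ~ N(0, process_var * I)
      observation equation: y_k = Q(s, a; theta_k) + v, v ~ N(0, obs_var)
    The observation y_k is the TD target computed from the target weights.
    """
    n = theta.size

    # Predict: a random walk keeps the mean and inflates the covariance.
    P_pred = P + process_var * np.eye(n)

    # Observation built from the target-network weights (TD target).
    q_next = 0.0 if done else np.max(phi_next_all @ theta_target)
    y = reward + gamma * q_next

    # Jacobian of Q with respect to theta; for a linear Q it is phi(s, a).
    h = phi_sa
    innovation = y - h @ theta

    # Scalar innovation variance and Kalman gain (one observation per step).
    S = h @ P_pred @ h + obs_var
    K = (P_pred @ h) / S

    # Correct the weight estimate and its covariance.
    theta_new = theta + K * innovation
    P_new = P_pred - np.outer(K, h) @ P_pred
    return theta_new, P_new


if __name__ == "__main__":
    # Toy usage: two features, two candidate actions in the next state.
    theta = np.zeros(2)           # current-network weights (EKF state mean)
    theta_target = np.zeros(2)    # target-network weights (held fixed here)
    P = np.eye(2)                 # initial weight covariance
    phi_sa = np.array([1.0, 0.0])                   # features of (s, a)
    phi_next_all = np.array([[1.0, 0.0], [0.0, 1.0]])  # features of (s', a')
    theta, P = ekf_q_update(theta, P, phi_sa, 1.0, phi_next_all, theta_target)
    print(theta, P)
```

For a deep network, the Jacobian would instead be the gradient of the selected Q-value with respect to all weights, and the covariance matrix would have one row and column per weight, so its size grows quadratically with the parameter count.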