UCAV Autonomous Maneuvering Decision Based on Curriculum Learning Mechanism Training

Shiyu Fang, Wenjie Zhao, Jun Li, Yanjun Shen
DOI: https://doi.org/10.1117/12.3010649
2023-01-01
Abstract: Based on the Proximal Policy Optimization with clipped objective (PPO-clip) algorithm framework, an autonomous maneuver decision-making method for short-range 1v1 combat between unmanned combat aerial vehicles (UCAVs) is designed and implemented. The curriculum learning (CL) mechanism is used to train the maneuver decision-making model, addressing the problem that the model fails to converge when fighting an enemy UCAV that performs complex maneuvers. The training process is divided into four stages in which the enemy UCAV's maneuvers progress from simple to complex, ending with our UCAV fighting an enemy UCAV that maneuvers intelligently. Four groups of simulation experiments demonstrate the effectiveness of the PPO-clip algorithm and show that the curriculum learning mechanism speeds up the convergence of the model.
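The two ingredients named in the abstract can be illustrated with a minimal sketch. The clipped surrogate below is the standard PPO-clip objective; the four stage names, the win-rate promotion rule, and the 0.8 threshold are hypothetical placeholders, since the abstract says only that there are four stages ordered from simple to complex and does not give the authors' stage definitions or promotion criterion.

```python
from dataclasses import dataclass


def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean PPO-clip surrogate objective (a quantity to maximize).

    ratios:     pi_new(a|s) / pi_old(a|s) for each sample
    advantages: advantage estimate for each sample
    epsilon:    clip range (0.2 is a common default, assumed here)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal length")
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - epsilon), 1.0 + epsilon)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)


# Hypothetical stage names; the abstract only states "simple to complex".
STAGES = (
    "stage 1: simple maneuvers",
    "stage 2: moderate maneuvers",
    "stage 3: complex maneuvers",
    "stage 4: intelligent opponent",
)


@dataclass
class Curriculum:
    """Advance to the next opponent once an assumed win-rate threshold is met."""
    stages: tuple = STAGES
    promote_at: float = 0.8  # assumed threshold, not taken from the paper
    index: int = 0

    @property
    def stage(self):
        return self.stages[self.index]

    def update(self, win_rate):
        """Record the latest evaluation win rate and return the active stage."""
        if win_rate >= self.promote_at and self.index < len(self.stages) - 1:
            self.index += 1
        return self.stage
```

In a training loop under these assumptions, the policy would be updated with the surrogate above against the opponent for `Curriculum.stage`, and `update` would be called after each evaluation round so that the policy learned on easier opponents initializes training on the next one.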