Reentry Trajectory Optimization Based on Deep Reinforcement Learning

Jiashi Gao, Xinming Shi, Zhongtao Cheng, Jizhang Xiong, Lei Liu, Yongji Wang, Ye Yang
DOI: https://doi.org/10.1109/ccdc.2019.8832559
2019-01-01
Abstract: This paper solves the reentry trajectory optimization problem for a reusable launch vehicle (RLV) using Deep Deterministic Policy Gradient (DDPG), a deep reinforcement learning method for decision making in continuous systems. Compared with traditional intelligent optimization algorithms, DDPG learns an appropriate action for each state encountered during flight by constructing an actor neural network and a critic neural network, which avoids the problems caused by improper segmentation in traditional intelligent algorithms. A greedy exploration strategy keeps the optimization process from becoming trapped in a local optimum. Comparing the trajectory optimization results with those of the particle swarm optimization algorithm verifies the effectiveness of DDPG. The trajectory optimized by DDPG is also smoother, and the optimization process is less prone to falling into a local optimum.
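The abstract names three ingredients: a critic that scores state-action pairs, slowly updated target networks as used in standard DDPG, and a greedy exploration rule. A minimal Python sketch of those pieces follows. It is not the authors' implementation: the function names, the scalar action, the hyperparameter values, and the reading of "greedy algorithm" as epsilon-greedy exploration are all assumptions made for illustration.

```python
import random


def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value at the target actor's action;
    the bootstrap term is dropped on terminal steps.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: move each target parameter a fraction tau
    toward the corresponding online parameter."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


def explore(action, low, high, epsilon, rng):
    """Epsilon-greedy style exploration for one continuous action
    (assumed interpretation of the abstract's 'greedy algorithm').

    With probability epsilon, sample uniformly within the bounds;
    otherwise keep the actor's action, clipped to the bounds.
    """
    if rng.random() < epsilon:
        return rng.uniform(low, high)
    return min(max(action, low), high)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical numbers, chosen only to exercise the functions.
    print(td_target(reward=1.0, next_q=2.0, done=False, gamma=0.5))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
    print(explore(0.3, low=-1.0, high=1.0, epsilon=0.1, rng=rng))
```

In a full training loop, the critic would be fit to `td_target` over minibatches from a replay buffer, and the actor would be updated along the critic's gradient with respect to the action. Both steps are omitted here because the abstract gives no network or reward details.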