Hypersonic Vehicle Attitude-Tracking Control Using Model-Free Deep Reinforcement Learning

Dan Wang
DOI: https://doi.org/10.1088/1742-6596/2383/1/012068
2022-01-01
Journal of Physics: Conference Series
Abstract: This paper applies proximal policy optimization (PPO), a state-of-the-art deep reinforcement learning (DRL) algorithm, to learn an attitude controller for hypersonic vehicles (HSVs). HSV attitude tracking is a continuous control problem: both the state space and the action space of the HSV are continuous. Because DRL maximizes the accumulated reward rather than the instantaneous reward, a fixed-length training episode and a normalized instantaneous reward function are introduced so that episodic returns can be compared directly. Numerical simulation results show that the proposed training scheme enables the learned controller to track attitude for the longitudinal motion of HSVs with uncertainties. The training scheme can also generalize to other continuous control problems.
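To illustrate why a fixed episode length and a normalized instantaneous reward make episodic returns comparable, here is a minimal Python sketch. It is not the paper's reward function, which the abstract does not give. The exponential reward shape, the `scale` parameter, and the names `normalized_reward`, `episodic_return`, and `EPISODE_LENGTH` are all assumptions made for illustration. The point is only that when every step reward lies in (0, 1] and every episode has the same number of steps, the return is bounded by the episode length and can be compared across episodes.

```python
import math


def normalized_reward(tracking_error, scale=1.0):
    """Hypothetical reward: maps a tracking error to (0, 1], with 1 at zero error."""
    return math.exp(-abs(tracking_error) / scale)


def episodic_return(tracking_errors, episode_length, scale=1.0):
    """Sum of step rewards over one fixed-length episode.

    With rewards in (0, 1] and a fixed number of steps, the return
    always lies in (0, episode_length].
    """
    if len(tracking_errors) != episode_length:
        raise ValueError("episode must contain exactly episode_length steps")
    return sum(normalized_reward(e, scale) for e in tracking_errors)


# Assumed episode length, chosen only for this illustration.
EPISODE_LENGTH = 200

# A perfect-tracking episode and a poor-tracking episode of equal length.
perfect = episodic_return([0.0] * EPISODE_LENGTH, EPISODE_LENGTH)
poor = episodic_return([2.0] * EPISODE_LENGTH, EPISODE_LENGTH)

# Dividing by the episode length gives a score in (0, 1] per episode.
print(perfect / EPISODE_LENGTH, poor / EPISODE_LENGTH)
```

Under these assumptions, a return close to `EPISODE_LENGTH` indicates near-perfect tracking over the whole episode, so learning curves from different runs share the same scale.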