An Application of Continuous Deep Reinforcement Learning Approach to Pursuit-Evasion Differential Game

Maolin Wang, Lixin Wang, Ting Yue
DOI: https://doi.org/10.1109/itnec.2019.8729310
2019-01-01
Abstract: The pursuit-evasion differential game is a classic decision-making problem in a continuous domain. Recently, reinforcement learning (RL) has greatly advanced research in decision-making. In this paper, the dynamic model of the game is described and the optimization problem of the pursuer is addressed. Reinforcement learning is used so that the control strategy is acquired through self-learning. The Deep Deterministic Policy Gradient (DDPG) algorithm, an actor-critic, model-free, end-to-end approach, is applied to train the pursuer. In the first training phase, the pursuer is trained against an evader that follows a given control strategy. In the second training phase, the pursuer and evader are trained simultaneously without any expert knowledge given in advance. The results show that both the pursuer and the evader can learn a control strategy during training.
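For readers unfamiliar with DDPG, the two update rules at the core of the standard algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `td_target` and `soft_update`, the values of `gamma` and `tau`, and the toy rewards are all assumptions chosen for the example, and the abstract does not state the paper's actual hyperparameters or reward design.

```python
import numpy as np

def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the target actor's action in the
    next state; terminal transitions (done == 1) do not bootstrap.
    """
    return reward + gamma * (1.0 - done) * next_q

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging of target networks: target <- tau*online + (1-tau)*target."""
    return [tau * w + (1.0 - tau) * wt
            for w, wt in zip(online_params, target_params)]

# Toy batch of three transitions (hypothetical numbers, e.g. a distance
# penalty per step and a bonus on capture; not taken from the paper).
rewards = np.array([-1.0, -0.5, 10.0])
next_q = np.array([2.0, 4.0, 7.0])
dones = np.array([0.0, 0.0, 1.0])
targets = td_target(rewards, next_q, dones, gamma=0.9)

# Toy parameter lists standing in for network weights.
online = [np.ones((2, 2))]
target = [np.zeros((2, 2))]
target = soft_update(target, online, tau=0.1)
```

In a two-phase setup like the one the abstract describes, the same updates would presumably be applied to the pursuer alone in the first phase and to separate pursuer and evader agents in the second, though the abstract does not give those details.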