A Fuzzy Deterministic Policy Gradient Algorithm for Pursuit-Evasion Differential Games.

Lixin Wang, Maolin Wang, Ting Yue
DOI: https://doi.org/10.1016/j.neucom.2019.07.038
IF: 6
2019-01-01
Neurocomputing
Abstract: Fuzzy inference systems combined with reinforcement learning are currently used in differential games to train agents that have no prior experience. However, reinforcement learning algorithms based on the actor-critic structure have the drawback that the policy depends on a probability distribution. In this paper, a novel fuzzy deterministic policy gradient algorithm is introduced and applied to classical 1-vs-1 constant-velocity pursuit-evasion differential games. The key goal is for the agent to self-learn the optimal strategy in the continuous action domain and to give the fuzzy rules a specific physical meaning. The proposed algorithm is based on the deterministic policy gradient theorem, and the agent learns a near-optimal strategy under the actor-critic structure. Fuzzy inference systems are used as the function approximators, so that a specific physical meaning can be read from the linguistic fuzzy rules. Furthermore, the proposed algorithm is applied to solve the decision-making problem of pursuit-evasion differential games. Comparison with other existing algorithms shows that the proposed algorithm achieves better precision and convergence efficiency.
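The abstract's core idea, a fuzzy inference system acting as a deterministic actor whose parameters follow the gradient of a critic, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a zero-order Takagi-Sugeno system with Gaussian membership functions over a single input, and it replaces the learned critic with a hypothetical closed-form stand-in, Q(x, a) = -(a - x)^2, whose best action is a = x. All names (`firing_strengths`, `fuzzy_actor`, `dpg_step`, `toy_dq_da`) and all numeric values are illustrative assumptions.

```python
import math

def firing_strengths(x, centers, sigma):
    """Normalized Gaussian firing strength of each fuzzy rule for input x."""
    w = [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]
    total = sum(w)
    return [wi / total for wi in w]

def fuzzy_actor(x, centers, sigma, theta):
    """Zero-order TSK output: firing-strength-weighted average of the
    rule consequents theta. The action is deterministic (no sampling)."""
    phi = firing_strengths(x, centers, sigma)
    return sum(p * t for p, t in zip(phi, theta))

def dpg_step(x, centers, sigma, theta, dq_da, lr):
    """One deterministic policy gradient step on the consequents.

    Chain rule: dQ/dtheta_l = dQ/da * da/dtheta_l, where da/dtheta_l is
    the normalized firing strength of rule l.
    """
    phi = firing_strengths(x, centers, sigma)
    a = fuzzy_actor(x, centers, sigma, theta)
    grad_a = dq_da(x, a)
    return [t + lr * grad_a * p for t, p in zip(theta, phi)]

def toy_dq_da(x, a):
    """Assumed stand-in critic gradient for Q(x, a) = -(a - x)^2."""
    return -2.0 * (a - x)

# Three rules, readable as "input is negative / zero / positive".
centers = [-1.0, 0.0, 1.0]
sigma = 0.5
theta = [0.0, 0.0, 0.0]

for step in range(2000):
    # Deterministic sweep of the input over [-1, 1).
    x = -1.0 + 2.0 * ((step * 0.618033988749895) % 1.0)
    theta = dpg_step(x, centers, sigma, theta, toy_dq_da, lr=0.1)
```

After training, each entry of `theta` can be read as the action recommended by one linguistic rule, which is the kind of interpretability the abstract attributes to the fuzzy approximator. In an actor-critic setting the closed-form `toy_dq_da` would be replaced by the gradient of a learned critic.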