Multi-robot Cooperation Strategy in a Partially Observable Markov Game Using Enhanced Deep Deterministic Policy Gradient

Qirong Tang,Jingtao Zhang,Fangchao Yu,Pengjie Xu,Zhongqun Zhang
DOI: https://doi.org/10.1007/978-3-030-26354-6_1
2019-01-01
Abstract:Deep reinforcement learning (DRL) has been applied to solve challenging problems in robotic domains. However, since non-stationary of the environment and the difficulty of long-term interaction between robots, traditional DRL is poorly suitable for multi-robot. Thus, an enhanced deep deterministic policy gradient algorithm is proposed in this study to explore the application of DRL in multi-robot domains. The algorithm ensures a cooperation strategy for multi-robot, which merely uses partially observed state of each robot, named a partially observable Markov game, realize global optimality in executing process. It is achieved by eliminating non-stationary of the environment in training process and a centralized critic for decentralized multi-robot. Simulations with increasingly complex environments are performed to validate the effectiveness of the proposed algorithm.
What problem does this paper attempt to address?