D3PG: Decomposed Deep Deterministic Policy Gradient for Continuous Control

Yinzhao Dong, Chao Yu, Hongwei Ge
DOI: https://doi.org/10.1007/978-3-030-64096-5_4
2020-01-01
Abstract: In this paper, we study how structural decomposition and multiagent interactions can be utilized by deep reinforcement learning to address high-dimensional robotic control problems. To this end, we propose the D3PG approach, a multiagent extension of DDPG that decomposes the global critic into a weighted sum of local critics. Each of these critics is modeled as an individual learning agent that governs the decision making of a particular joint of a robot. We then propose a method to learn the weights during training in order to capture different levels of dependencies among the agents. The experimental evaluation demonstrates that D3PG can achieve competitive or significantly improved performance compared to some widely used deep reinforcement learning algorithms. Another advantage of D3PG is that it provides explicit interpretations of the final learned policy as well as the underlying dependencies among the joints of a learning robot.
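The decomposition described in the abstract, a global critic expressed as a weighted sum of per-joint local critics, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the local critics are toy linear functions of the state and one joint's action, the weights are obtained by a softmax over hypothetical learnable logits, and the names `local_critic`, `global_critic`, and `weight_logits` are invented here. The paper's actual critic inputs, network architecture, and weight-learning method may differ.

```python
import math

def softmax(logits):
    # Numerically stable softmax; used here (as an assumption) to keep
    # the mixing weights positive and summing to one.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def local_critic(params, state, joint_action):
    # Hypothetical linear local critic for one joint:
    # Q_i(s, a_i) = w_s . s + w_a * a_i + b
    w_s, w_a, b = params
    return sum(w * s for w, s in zip(w_s, state)) + w_a * joint_action + b

def global_critic(critic_params, weight_logits, state, action):
    # Global value as a weighted sum of the local critics' values.
    weights = softmax(weight_logits)
    local_qs = [local_critic(p, state, a_i)
                for p, a_i in zip(critic_params, action)]
    q_global = sum(w * q for w, q in zip(weights, local_qs))
    return q_global, weights, local_qs

# Toy robot with three joints and a two-dimensional state.
state = [0.5, -1.0]
action = [0.2, -0.4, 0.1]  # one scalar action per joint
critic_params = [
    ([1.0, 0.0], 2.0, 0.0),
    ([0.0, 1.0], 1.0, 0.5),
    ([0.5, 0.5], -1.0, 0.0),
]
weight_logits = [0.0, 0.0, 0.0]  # equal logits give uniform weights

q_global, weights, local_qs = global_critic(
    critic_params, weight_logits, state, action
)
print(q_global, weights, local_qs)
```

In a sketch like this, inspecting `weights` after training is what would give the kind of interpretability the abstract mentions: a larger weight marks a joint whose local critic contributes more to the global value estimate.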