Decentralized Multi-Task Reinforcement Learning Policy Gradient Method with Momentum over Networks.

Junru Shi, Qiong Wang, Muhua Liu, Zhihang Ji, Ruijuan Zheng, Qingtao Wu
DOI: https://doi.org/10.1007/s10489-022-04028-8
IF: 5.3
2022-01-01
Applied Intelligence
Abstract: The policy gradient (PG) method is an effective way to find the optimal policy quickly in reinforcement learning problems: it parameterizes the policy and updates the policy parameters directly. Momentum methods are commonly employed to improve convergence when training centralized deep networks; they accelerate training by changing the descent direction of the gradients. However, decentralized variants of PG with momentum have rarely been investigated. For this reason, we propose a Decentralized Policy Gradient algorithm with Momentum, called DPGM, for solving multi-task reinforcement learning problems. We rigorously analyze the convergence of DPGM and show that it attains a rate of O(1/T), where T denotes the number of iterations. This rate matches the state of the art for decentralized PG methods. Furthermore, we provide experimental verification in a decentralized reinforcement learning environment to support the theoretical result.
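The abstract does not spell out the DPGM update rule, so the following is only a minimal sketch of the general idea it describes: agents on a network each take a momentum-accelerated gradient step on their own task and average parameters with their neighbors. Everything specific here is an assumption made for illustration and is not taken from the paper: the ring topology, the uniform 1/3 mixing weights, the heavy-ball form of the momentum, the step sizes, the names `local_gradient` and `run_decentralized_momentum`, and above all the toy quadratic objective that stands in for each task's expected return (a real PG method would use stochastic gradient estimates from sampled trajectories).

```python
# Illustrative sketch only: this is NOT the paper's DPGM algorithm.
# Assumed for illustration: ring network, uniform mixing weights,
# heavy-ball momentum, and exact gradients of a toy quadratic "return".


def local_gradient(theta, target):
    """Gradient of the toy local objective J_i(theta) = -0.5 * (theta - target)**2."""
    return target - theta


def run_decentralized_momentum(targets, steps=2000, alpha=0.01, beta=0.9):
    """Decentralized gradient ascent with momentum on a ring of len(targets) >= 3 agents."""
    n = len(targets)
    thetas = [0.0] * n   # one scalar "policy parameter" per agent
    momenta = [0.0] * n  # one momentum buffer per agent

    for _ in range(steps):
        # Each agent evaluates the gradient of its own task at its own parameter.
        grads = [local_gradient(thetas[i], targets[i]) for i in range(n)]
        # Heavy-ball momentum: blend the previous direction with the new gradient.
        momenta = [beta * momenta[i] + grads[i] for i in range(n)]
        # Consensus step: average with the two ring neighbors (weights 1/3 each).
        mixed = [
            (thetas[i - 1] + thetas[i] + thetas[(i + 1) % n]) / 3.0
            for i in range(n)
        ]
        # Ascent step along the momentum direction.
        thetas = [mixed[i] + alpha * momenta[i] for i in range(n)]

    return thetas


# Hypothetical per-task optima; the average objective is maximized at their mean.
targets = [1.0, 2.0, 3.0, 6.0]
thetas = run_decentralized_momentum(targets)
network_average = sum(thetas) / len(thetas)
```

In this toy setting the network average of the parameters approaches the maximizer of the average objective (the mean of `targets`), while the individual agents end up close to one another rather than in exact agreement, because the step size is held constant. The sketch says nothing about the O(1/T) rate claimed for DPGM.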