Progressive Diversifying Policy for Multi-Agent Reinforcement Learning

Shaoqi Sun, Yuanzhao Zhai, Kele Xu, Dawei Feng, Bo Ding
DOI: https://doi.org/10.1109/ICASSP49357.2023.10096125
2023-01-01
Abstract: Multi-Agent Reinforcement Learning (MARL) has recently achieved promising performance in many collaborative decision-making tasks. However, one of the main bottlenecks for MARL is the sparsity of the team reward, which can lead to the homogenization of agents’ behaviors. To address this issue, we propose a Progressive Diversifying Policy (PDP) algorithm. Specifically, we actively amplify the diversity between agents’ policies during learning and exploit this diversity as an additional intrinsic reward for MARL. Furthermore, we propose a progressive diversity boosting policy to find a better team policy. With these improvements, our method can handle sparse team rewards and alleviate homogeneous agent behaviors. We conduct experiments on widely used MARL environments, and the results show that PDP achieves state-of-the-art performance while maintaining a competitive convergence speed.
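To make the idea of diversity as an intrinsic reward concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify the diversity measure or how it is boosted progressively. The mean pairwise Jensen-Shannon divergence, the linear ramp, and every function and parameter name below (`diversity_bonus`, `diversity_weight`, `shaped_reward`, `max_weight`) are assumptions chosen for illustration only.

```python
import math
from itertools import combinations


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing to the sum.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def diversity_bonus(action_probs):
    """Assumed diversity measure: mean pairwise JS divergence between
    the agents' action distributions in the current state."""
    pairs = list(combinations(action_probs, 2))
    if not pairs:
        return 0.0
    return sum(js_divergence(p, q) for p, q in pairs) / len(pairs)


def diversity_weight(step, total_steps, max_weight=0.1):
    """Hypothetical schedule: ramp the weight linearly from 0 to max_weight.
    This is one possible reading of "progressive", not the paper's rule."""
    return max_weight * min(1.0, step / total_steps)


def shaped_reward(team_reward, action_probs, step, total_steps):
    """Sparse team reward plus the weighted intrinsic diversity bonus."""
    weight = diversity_weight(step, total_steps)
    return team_reward + weight * diversity_bonus(action_probs)
```

Under these assumptions, agents with identical action distributions receive no bonus, while agents whose distributions differ receive a positive intrinsic reward even when the team reward is zero, which is the effect the abstract describes for sparse-reward settings.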