Qauxi: Cooperative Multi-Agent Reinforcement Learning with Knowledge Transferred from Auxiliary Task

Wenqian Liang, Ji Wang, Weidong Bao, Xiaomin Zhu, Guanlin Wu, Dayu Zhang, Liyuan Niu
DOI: https://doi.org/10.1016/j.neucom.2022.06.091
IF: 6
2022-01-01
Neurocomputing
Abstract: Deep multi-agent reinforcement learning (MARL) can efficiently learn decentralized policies for real-world applications. However, current MARL methods struggle to transfer knowledge from previously learned tasks to improve their exploration. In this paper, we propose a novel MARL method called Qauxi, which forms a coordinated exploration scheme that improves traditional MARL algorithms by reusing meta-experience transferred from an auxiliary task. We also use a weighting function to weight the importance of each joint action in the monotonic loss function, so that training focuses on the more important joint actions and avoids yielding suboptimal policies. Furthermore, we prove the convergence of Qauxi based on the contraction mapping theorem. Qauxi is evaluated on the widely adopted StarCraft Multi-Agent Challenge (SMAC) benchmark across easy, hard, and super hard scenarios. Experimental results show that the proposed method outperforms state-of-the-art MARL methods by a large margin in the most challenging super hard scenarios.
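The idea of weighting joint actions inside the loss can be illustrated with a minimal sketch. The abstract does not specify Qauxi's weighting function, so the rule below is an assumption borrowed from the optimistic weighting of Weighted QMIX: a sample gets weight 1 when the joint value estimate falls below its target and a smaller weight `alpha` otherwise. The function name `weighted_td_loss` and its parameters are hypothetical and are not taken from the paper.

```python
def weighted_td_loss(q_tot, targets, alpha=0.5):
    """Mean weighted squared TD error over a batch of joint actions.

    Assumption (not from the paper): the weight is 1.0 when the estimate
    underestimates its target and `alpha` otherwise, as in the optimistic
    weighting of Weighted QMIX. Qauxi's own weighting function may differ.
    """
    if len(q_tot) != len(targets):
        raise ValueError("q_tot and targets must have the same length")
    if not q_tot:
        raise ValueError("batch must not be empty")

    total = 0.0
    for q, y in zip(q_tot, targets):
        # Down-weight samples whose value is already at or above the target.
        weight = 1.0 if q < y else alpha
        total += weight * (q - y) ** 2
    return total / len(q_tot)


if __name__ == "__main__":
    # One underestimated sample (weight 1.0) and one overestimated (weight 0.5).
    print(weighted_td_loss([1.0, 3.0], [2.0, 2.0], alpha=0.5))
```

With `alpha` below 1, underestimated joint actions dominate the gradient. This is one way a monotonic mixing network can be steered away from the suboptimal policies the abstract mentions.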