A Cooperative Multi-Agent Reinforcement Learning Algorithm Based on Dynamic Self-Selection Parameters Sharing

WANG Han, YU Yang, JIANG Yuan
DOI: https://doi.org/10.11959/j.issn.2096-6652.202214
2022-01-01
Abstract: In multi-agent reinforcement learning, parameter sharing can effectively alleviate the learning inefficiency caused by non-stationarity. However, maintaining the same policy for all agents during learning may have detrimental effects. To solve this problem, a new approach is introduced that gives agents the ability to automatically identify peers that may benefit from parameter sharing and to dynamically share parameters with them during learning. Specifically, agents encode empirical trajectories as implicit information that can represent their potential intentions, and select peers with which to share parameters by comparing their intentions. Experiments show that the proposed method not only improves the efficiency of parameter sharing but also ensures the quality of policy learning in multi-agent systems.
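The peer-selection idea in the abstract (encode each agent's trajectories as an intention vector, then share parameters only among agents whose intentions are similar) can be illustrated with a minimal sketch. The abstract does not specify the encoder, the similarity measure, or the selection rule, so everything here is an assumption made for illustration: `encode_intention` is a hypothetical stand-in that mean-pools trajectory features, cosine similarity is an assumed comparison, and `threshold` is an assumed hyperparameter. It is not the authors' implementation.

```python
import numpy as np


def encode_intention(trajectory: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in encoder (assumption, not from the paper):
    mean-pool a (T, d) array of trajectory features into a unit vector."""
    v = trajectory.mean(axis=0)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def select_sharing_peers(trajectories, threshold=0.9):
    """For each agent, return the indices of peers whose intention vectors
    have cosine similarity >= threshold (assumed selection rule)."""
    z = np.stack([encode_intention(t) for t in trajectories])
    sim = z @ z.T  # cosine similarity, since rows are unit-normalized
    n = len(trajectories)
    return {
        i: [j for j in range(n) if j != i and sim[i, j] >= threshold]
        for i in range(n)
    }


# Toy data: agents 0 and 1 behave alike, agent 2 behaves differently.
rng = np.random.default_rng(0)
base_a = np.array([1.0, 0.0, 0.0])
base_b = np.array([0.0, 1.0, 0.0])
trajs = [
    base_a + 0.01 * rng.normal(size=(20, 3)),
    base_a + 0.01 * rng.normal(size=(20, 3)),
    base_b + 0.01 * rng.normal(size=(20, 3)),
]
peers = select_sharing_peers(trajs)
print(peers)
```

In this toy setting agents 0 and 1 select each other as sharing peers while agent 2 keeps its own parameters. Recomputing the peer sets periodically during training would make the sharing dynamic, which is the behavior the abstract describes.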