Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning

Yongyuan Liang, Bangwei Li
DOI: https://doi.org/10.48550/arXiv.2003.13085
2020-03-30
Abstract: Multi-agent reinforcement learning (MARL) is a standard framework for modeling multi-agent interactions in real-world scenarios. Inspired by experience sharing in human groups, reusing learned knowledge in parallel between agents can potentially improve team learning performance, especially in multi-task environments. When all agents interact with the environment and learn simultaneously, how each independent agent selectively learns from other agents' behavior knowledge is the problem we need to solve. This paper proposes a novel knowledge transfer framework in MARL, PAT (Parallel Attentional Transfer). We design two acting modes in PAT, student mode and self-learning mode. Each agent in our approach trains a decentralized student actor-critic to determine its acting mode at each time step. When agents are unfamiliar with the environment, the shared attention mechanism in student mode effectively selects learning knowledge from other agents to decide agents' actions. In empirical evaluations, PAT outperforms prior state-of-the-art advising approaches. Our approach not only significantly improves team learning rate and global performance, but is also flexible and transferable to various multi-agent systems.
Artificial Intelligence, Machine Learning, Multiagent Systems
What problem does this paper attempt to address?
This paper addresses how to optimize knowledge transfer among agents in multi-agent reinforcement learning (MARL), especially in cooperative multi-agent systems in joint-task or multi-task environments under local constraints. Specifically, it focuses on how each independent agent selectively learns from the behavioral knowledge of other agents when all agents interact with the environment and learn simultaneously. This involves three main aspects:

1. **Knowledge Transfer Decision**: Determine when knowledge needs to be acquired from other agents.
2. **Knowledge Selection**: Select the most appropriate source of knowledge from multiple possible teacher agents.
3. **Knowledge Utilization**: Effectively use the selected knowledge to guide one's own behavior.

To solve these problems, the paper proposes a new knowledge transfer framework, PAT (Parallel Attentional Transfer). By introducing an attention mechanism, PAT enables student agents to dynamically extract useful knowledge from the experiences of other agents. The specific implementation methods include:

- **Two Action Modes**: PAT designs two action modes, student mode and self-learning mode. In student mode, an agent decides its action according to the suggestions of other agents; in self-learning mode, an agent decides its action based on the behavioral knowledge it has learned independently.
- **Attention Mechanism**: In student mode, PAT uses the attention mechanism as a teacher selector, choosing the most appropriate teacher according to the teacher agent's familiarity with the current state and the effectiveness of its current strategy. This helps the student agent obtain the best suggestions from the most reliable teacher.
- **Flexible Role Switching**: In the PAT framework, the role of an agent is not fixed. An agent can be both a student and a teacher, which means it can learn from other agents while sharing its own experience with them.

In this way, PAT not only improves the overall learning efficiency and performance of the team, but is also highly flexible and scalable, making it suitable for various multi-agent systems. Experimental results show that PAT performs better than existing methods in multi-task and joint-task scenarios, and that it maintains good performance as the number of agents increases.
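The mode switch and attention-based teacher selection described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the `p_student_mode` input (standing in for the output of the student actor-critic), and the vector encodings are all hypothetical. Generic scaled dot-product attention is swapped in as the scoring rule because this summary does not specify how PAT computes its attention weights. Blending teachers' action distributions by those weights is likewise an assumption.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_advice(student_query, teacher_keys, teacher_policies):
    """Weight teachers by scaled dot-product attention (an assumed scoring rule)
    and blend their action distributions into one advised distribution."""
    d = len(student_query)
    scores = [
        sum(q * k for q, k in zip(student_query, key)) / math.sqrt(d)
        for key in teacher_keys
    ]
    weights = softmax(scores)
    n_actions = len(teacher_policies[0])
    advised = [
        sum(w * policy[a] for w, policy in zip(weights, teacher_policies))
        for a in range(n_actions)
    ]
    return weights, advised


def choose_action(p_student_mode, own_policy, student_query,
                  teacher_keys, teacher_policies, rng):
    """Pick an acting mode, then sample an action from the matching distribution.

    p_student_mode is a hypothetical stand-in for the student actor-critic's
    output: the probability of acting in student mode at this time step.
    """
    if rng.random() < p_student_mode:
        mode = "student"
        _, dist = attention_advice(student_query, teacher_keys, teacher_policies)
    else:
        mode = "self-learning"
        dist = own_policy
    action = rng.choices(range(len(dist)), weights=dist, k=1)[0]
    return mode, action


# Usage: one student, two teachers, two actions (all values are made up).
rng = random.Random(0)
query = [1.0, 0.0]                      # student's encoding of its state
keys = [[1.0, 0.0], [0.0, 1.0]]         # teachers' encodings
policies = [[0.9, 0.1], [0.2, 0.8]]     # teachers' action distributions
mode, action = choose_action(0.7, [0.5, 0.5], query, keys, policies, rng)
```

Taking the index of the largest attention weight, instead of blending, would select a single teacher, which is closer to the "teacher selector" wording above. Because every agent can supply a key and a policy as well as issue a query, the same functions cover both the student and the teacher role.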