Self-Attention Guided Advice Distillation in Multi-Agent Deep Reinforcement Learning

Yang Li, Sihan Zhou, Yaqing Hou, Liran Zhou, Hongwei Ge, Liang Feng, Siyu Wang
DOI: https://doi.org/10.1109/ijcnn60899.2024.10650452
2024-01-01
Abstract: Advising is an effective way to improve agent learning performance in multi-agent deep reinforcement learning. Existing advising methods typically rely on a teacher-student framework in which a teacher agent provides student agents with action or Q-value advice. However, they share a common limitation: the teacher’s advice only helps the student make a single decision in the current state; it is not internalized into the student’s knowledge and does not change the student’s decision model. The advice therefore acts as a one-time instruction rather than a learning aid. If the student encounters the same problem again, it may still be unable to make a sound decision and must request advice once more. This fails to improve the agent’s decision-making ability at a fundamental level and wastes considerable communication. We therefore propose an attention-based multi-agent advice distillation framework in which the student agent requests advice from an experienced teacher and distills that advice into its own decision model via the self-attention mechanism. As a result, advice is fully utilized, yielding a rapid and intrinsic improvement in the agent’s decision-making capabilities. Our empirical evaluations demonstrate that, compared with existing advising methods, our method significantly improves learning performance while reducing communication cost.