KnowRU: Knowledge Reusing via Knowledge Distillation in Multi-agent Reinforcement Learning

Zijian Gao, Kele Xu, Bo Ding, Huaimin Wang, Yiying Li, Hongda Jia
DOI: https://doi.org/10.3390/e23081043
2021-03-27
Abstract: Recently, deep reinforcement learning (RL) algorithms have made dramatic progress in the multi-agent domain. However, training on increasingly complex tasks is time-consuming and resource-intensive. To alleviate this problem, efficiently leveraging historical experience is essential, yet this is under-explored in previous studies: most existing methods may fail to achieve this goal in a continuously changing system because of their complicated design and the environmental dynamics. In this paper, we propose a knowledge-reusing method named "KnowRU", which can be easily deployed in most multi-agent reinforcement learning (MARL) algorithms without complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents, with the goal of accelerating the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments with state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU can outperform recently reported methods, which underscores the importance of the proposed knowledge reuse for MARL.
Artificial Intelligence, Multiagent Systems
What problem does this paper attempt to address?
The paper addresses how to reuse historical experience in multi-agent reinforcement learning (MARL) so that training on new tasks is faster and agents reach better asymptotic performance. It proposes a knowledge-reuse method named "KnowRU", which applies knowledge distillation (KD) and can be deployed in a wide range of MARL algorithms without complex hand-coded design.

### Background of the Paper and Problem Definition

As deep reinforcement learning (RL) algorithms have progressed in the multi-agent field, training on increasingly complex tasks has become both time-consuming and resource-intensive. Reusing historical experience is a natural way to reduce this cost. However, most existing methods struggle to do so in continuously changing systems because of their complex designs and the dynamics of the environment. How to reuse historical knowledge efficiently in a multi-agent setting therefore remains an open problem.

### Overview of the Solution

The solution proposed in the paper is **KnowRU**, a knowledge-reuse method based on knowledge distillation. Its core idea is to have new agents imitate the behavior of agents that performed well on previous tasks, which accelerates training on new tasks and improves asymptotic performance. The main components are:

1. **Knowledge Distillation Framework**: Knowledge distillation transfers the knowledge of agents trained on previous tasks to new agents. The transfer is achieved by minimizing the gap between the outputs of the current agent and those of the historical agent.
2. **Task Independence**: KnowRU is designed to be task-independent and can be applied to various MARL algorithms without complex task-specific adjustments.
3. **Two-Stage Training**: The training process is divided into two stages:
   - **Guiding Stage**: Early in training, the current agent relies mainly on the guidance of historical agents to learn basic knowledge quickly.
   - **Specialization Stage**: As training progresses, the influence of historical agents is gradually reduced so that the current agent adapts to the specific requirements of the new task.

### Experimental Verification

To verify the effectiveness of KnowRU, the paper reports extensive experiments with multiple MARL algorithms (such as MADDPG and MAAC). The scenarios include cooperative tasks (such as simple spread and cooperative treasure collection) and competitive tasks (such as simple adversary). The results show that KnowRU helps in the following respects:

- **Faster Training**: Early in training, KnowRU significantly improves agent performance and reduces the time needed to reach a given performance level.
- **Better Asymptotic Performance**: On complex tasks, KnowRU helps agents reach higher asymptotic performance and avoid premature convergence to sub-optimal solutions.
- **Robustness**: KnowRU improves performance consistently across the different task scenarios.

### Conclusion

With KnowRU, the paper addresses the problem of reusing historical experience in multi-agent reinforcement learning. The method accelerates training on new tasks and improves the asymptotic performance of agents. Because it does not depend on a specific task or algorithm, it is applicable across a range of MARL settings.
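
As a rough illustration of the distillation objective and the two-stage schedule described in the overview, the following minimal sketch adds a weighted distillation term to an agent's ordinary RL loss. It is not the authors' implementation. The summary only says that the output gap between the current and historical agent is minimized, so the use of KL divergence over softmax outputs is an assumption, and the function names (`distill_weight`, `total_loss`), the linear decay, and the `guide_steps` and `decay_steps` parameters are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distill_weight(step, guide_steps, decay_steps):
    """Hypothetical schedule for the distillation weight.

    Guiding stage (step < guide_steps): full weight of 1.0.
    Specialization stage: linear decay from 1.0 down to 0.0 over decay_steps.
    """
    if step < guide_steps:
        return 1.0
    progress = (step - guide_steps) / decay_steps
    return max(0.0, 1.0 - progress)


def total_loss(rl_loss, student_logits, teacher_logits, step,
               guide_steps=1000, decay_steps=4000, temperature=1.0):
    """RL loss plus a weighted distillation term (assumed to be a KL divergence).

    teacher_logits come from the historical agent, student_logits from the
    agent currently being trained on the new task.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kd_loss = kl_divergence(teacher, student)
    return rl_loss + distill_weight(step, guide_steps, decay_steps) * kd_loss


if __name__ == "__main__":
    teacher_out = [2.0, 0.0, -1.0]   # made-up outputs of a historical agent
    student_out = [0.0, 0.0, 0.0]    # made-up outputs of an untrained agent
    for step in (0, 3000, 10000):
        print(step, total_loss(1.0, student_out, teacher_out, step))
```

In this sketch the distillation term dominates early on, which corresponds to the guiding stage, and vanishes once the weight reaches zero, after which the agent is trained by the task's own RL loss alone.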