Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration

Cheng Xu, Changtian Zhang, Yuchen Shi, Ran Wang, Shihong Duan, Yadong Wan, Xiaotong Zhang
2024-08-21
Abstract: Recent advancements in reinforcement learning have made significant impacts across various domains, yet they often struggle in complex multi-agent environments due to issues like algorithm instability, low sampling efficiency, and the challenges of exploration and dimensionality explosion. Hierarchical reinforcement learning (HRL) offers a structured approach to decompose complex tasks into simpler sub-tasks, which is promising for multi-agent settings. This paper advances the field by introducing a hierarchical architecture that autonomously generates effective subgoals without explicit constraints, enhancing both flexibility and stability in training. We propose a dynamic goal generation strategy that adapts based on environmental changes. This method significantly improves the adaptability and sample efficiency of the learning process. Furthermore, we address the critical issue of credit assignment in multi-agent systems by synergizing our hierarchical architecture with a modified QMIX network, thus improving overall strategy coordination and efficiency. Comparative experiments with mainstream reinforcement learning algorithms demonstrate the superior convergence speed and performance of our approach in both single-agent and multi-agent environments, confirming its effectiveness and flexibility in complex scenarios. Our code is open-sourced at: https://github.com/SICC-Group/GMAH.
Multiagent Systems, Robotics
What problem does this paper attempt to address?
The main problem this paper attempts to address is the set of challenges reinforcement learning faces in multi-agent environments: algorithm instability, low sample efficiency, exploration difficulty, and the curse of dimensionality. Specifically, the paper addresses these issues by introducing a hierarchical reinforcement learning (HRL) architecture based on sub-goals. This method can autonomously generate effective sub-goals without explicit constraints, thereby enhancing training flexibility and stability. Additionally, the paper proposes a dynamic goal generation strategy that can adjust according to environmental changes, further improving the adaptability and sample efficiency of the learning process. Moreover, by combining the hierarchical architecture with an improved QMIX network, the paper addresses the credit assignment problem in multi-agent systems, improving overall strategy coordination and efficiency.

### Main Contributions:

1. **Task Tree-Based Sub-Goal Generation Method**: Innovated the design of the sub-goal space to better meet the needs of low-level policies, simplified the design of intrinsic reward functions, and improved policy performance.
2. **Adaptive Sub-Goal Generation Strategy**: Proposed a method for dynamically adjusting sub-goals to cope with significant environmental feature changes, ensuring a more robust and efficient learning process.
3. **Goal Mixing Network Fine-Tuning**: Introduced a new mixing network to fine-tune high-level policies by training joint goal value functions with global rewards, extending the hierarchical framework to multi-agent environments and addressing complex issues such as dimensionality and reward distribution.

These contributions enable the proposed GMAH method to perform excellently in multi-agent environments, validating the potential of hierarchical learning architectures in complex scenarios.
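
To make the credit-assignment idea concrete, the following is a minimal sketch of QMIX-style monotonic mixing applied to per-agent goal values. It is not the paper's goal mixing network: the class `MonotonicMixer`, the helpers `make_linear` and `linear`, the single-layer form, and the random untrained weights are all assumptions made for illustration. The only property it is meant to show is the one QMIX relies on: mixing weights are generated from the global state and forced to be non-negative, so raising any one agent's value can never lower the joint value.

```python
import random


def make_linear(n_in, n_out, rng):
    """Hypothetical helper: a random (untrained) linear layer as (weights, bias)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, bias


def linear(layer, x):
    """Apply a linear layer to the input vector x."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class MonotonicMixer:
    """Simplified one-layer mixer (assumption): Q_tot = sum_i |w_i(s)| * Q_i + b(s)."""

    def __init__(self, n_agents, state_dim, seed=0):
        rng = random.Random(seed)
        # Hypernetworks: map the global state to mixing weights and a bias.
        self.hyper_w = make_linear(state_dim, n_agents, rng)
        self.hyper_b = make_linear(state_dim, 1, rng)

    def mix(self, agent_values, state):
        # The absolute value keeps every mixing weight non-negative,
        # which makes Q_tot monotonic in each agent's value.
        weights = [abs(w) for w in linear(self.hyper_w, state)]
        bias = linear(self.hyper_b, state)[0]
        return sum(w * q for w, q in zip(weights, agent_values)) + bias


# Usage: three agents, a four-dimensional global state (both arbitrary).
mixer = MonotonicMixer(n_agents=3, state_dim=4)
state = [0.1, -0.3, 0.5, 0.2]
q_base = mixer.mix([1.0, 2.0, 3.0], state)
q_raised = mixer.mix([1.0, 2.5, 3.0], state)  # only agent 2's value increases
print(q_raised >= q_base)
```

Because of this monotonicity, each agent can pick its own best goal greedily while remaining consistent with the maximum of the joint value. In a real system the hypernetworks would be trained against the global reward rather than left at random initialization.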