Abstract: Multi-agent reinforcement learning (MARL) is a central approach in multi-agent systems: it addresses complex real-world problems, supports collaboration and coordination among agents, and enables intelligent decision-making across domains. However, training a MARL network is costly, demanding substantial computational resources to interact with diverse environmental variables, extract state representations, and acquire decision-making knowledge. Recent breakthroughs in large-scale pre-trained models raise a question: can we uncover shared knowledge in multi-agent reinforcement learning and leverage pre-trained models to expedite training for future tasks? To address this question, we present a multi-task learning approach that aims to extract and harness common decision-making knowledge, such as cooperation and competition, across different tasks. Our approach trains multiple multi-agent tasks concurrently, with each task employing independent front-end perception layers while sharing back-end decision-making layers. This decoupling of state representation extraction from decision-making allows for more efficient training and better transferability. To evaluate the approach, we conduct experiments in two distinct environments: the StarCraft Multi-Agent Challenge (SMAC) and Google Research Football (GRF). The experimental results show that the shared decision-making network transfers smoothly to other tasks, significantly reducing training costs and improving final performance. Furthermore, visualizations indicate the presence of general multi-agent decision-making knowledge within the shared network layers, further supporting the effectiveness of our approach.
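The split between per-task perception layers and shared decision-making layers can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MultiTaskAgent`, the layer sizes, the fixed action count, the task names, and the use of plain dense layers are all assumptions made for illustration, and no training loop is shown. It only shows how each task gets its own front-end encoder while every task routes through the same back-end layers, and how a new task can reuse that back-end.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Create a dense layer as (weights, bias) with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(layer, x):
    """Apply a dense layer: y = W x + b."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Element-wise ReLU."""
    return [max(0.0, v) for v in values]


class MultiTaskAgent:
    """Hypothetical agent network: per-task encoders, shared decision layers."""

    def __init__(self, obs_dims, latent_dim=8, n_actions=5, seed=0):
        self.rng = random.Random(seed)
        self.latent_dim = latent_dim
        # Independent front-end perception layer for each task
        # (observation sizes differ between tasks).
        self.encoders = {task: make_linear(dim, latent_dim, self.rng)
                         for task, dim in obs_dims.items()}
        # Back-end decision-making layers shared by all tasks.
        # Assumption: every task uses the same number of actions.
        self.shared_hidden = make_linear(latent_dim, latent_dim, self.rng)
        self.shared_out = make_linear(latent_dim, n_actions, self.rng)

    def q_values(self, task, obs):
        """Encode with the task's own front-end, then decide with the shared back-end."""
        latent = relu(linear(self.encoders[task], obs))
        hidden = relu(linear(self.shared_hidden, latent))
        return linear(self.shared_out, hidden)

    def add_task(self, task, obs_dim):
        """Attach a fresh encoder for a new task; shared layers are reused as-is."""
        self.encoders[task] = make_linear(obs_dim, self.latent_dim, self.rng)


# Two hypothetical tasks with different observation sizes.
demo_agent = MultiTaskAgent({"smac_task": 12, "grf_task": 20})
demo_q = demo_agent.q_values("smac_task", [0.1] * 12)
```

In a transfer setting of the kind the abstract describes, one would presumably keep `shared_hidden` and `shared_out` from multi-task training and learn only the encoder added by `add_task`; whether the shared layers are frozen or fine-tuned is not stated in the abstract.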

Q-SAT: Value Factorization with Self-Attention for Deep Multi-Agent Reinforcement Learning

Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization.

S2rl

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

DVF: Multi-agent Q-learning with difference value factorization

Oxidative need and oxidative capacity following traumatic brain injury.

Factorized Q-learning for Large-Scale Multi-Agent Systems

Qrelation: an Agent Relation-Based Approach for Multi-Agent Reinforcement Learning Value Function Factorization.

QFree: A Universal Value Function Factorization for Multi-Agent Reinforcement Learning

Multiagent Q-learning with Sub-Team Coordination.

AVD-Net: Attention Value Decomposition Network for Deep Multi-Agent Reinforcement Learning

Learning Multi-Agent Cooperation via Considering Actions of Teammates

Credit-of-Q-value for Multi-Agent Reinforcement Learning

Value function factorization with dynamic weighting for deep multi-agent reinforcement learning

On Stateful Value Factorization in Multi-Agent Reinforcement Learning

POWQMIX: Weighted Value Factorization with Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning

Decomposed Soft Actor-Critic Method for Cooperative Multi-Agent Reinforcement Learning

Boosting Value Decomposition Via Unit-Wise Attentive State Representation for Cooperative Multi-Agent Reinforcement Learning

Multi-Task Multi-Agent Shared Layers are Universal Cognition of Multi-Agent Coordination

Regularized Softmax Deep Multi-Agent Q-Learning.