Abstract: Multi-task multi-agent reinforcement learning aims to control multiple agents so that they perform well on multiple tasks. It faces three core challenges: the varying number of agents and entities, the disparities in cooperative behaviors among different tasks, and the training imbalance caused by varying task difficulty. To address these issues, we propose a novel framework named Task-Entity Transformer Qmix (TETQmix), which employs pretrained language models for task encoding, uses the proposed Task-Entity Transformer to handle observations across tasks, and adjusts task learning weights to achieve balanced multi-task training. The Task-Entity Transformer not only handles multi-task scenarios with varying numbers of agents and entities, but also leverages cross-attention modules to integrate observation and task embeddings, so that each agent can obtain individual values and decisions for multiple tasks. We then use a transformer-based mixer to monotonically combine the individual values, and train the whole network’s parameters using temporal-difference errors. To facilitate multi-task training, we define task regret as the difference between the current-stage return and the candidate best one, and adjust the learning weight of each task based on its task regret. Experiments are conducted on both simulated multi-particle environments and real-world multi-robot systems. Compared with existing baselines, our method is not only superior in multi-task learning efficiency but also shows promising transfer ability on unseen tasks.

Note to Practitioners: The flexibility of multi-agent systems makes them well suited to multiple tasks. Compared with designing a separate decision model for each task, it is more convenient to use a single decision model for all of them. In addition, integrating trajectory data from similar tasks to train one multi-task decision model makes full use of those data. Natural language provides a powerful tool for describing the task context and emphasizing the similarities and differences among tasks. Pretrained language models can encode the task context, which lets the decision model adjust its output distribution for each task and even synthesize decisions from existing, similar tasks to achieve promising zero-shot and few-shot transfer performance on unseen tasks. With the proposed TETQmix, practitioners can realize multi-task capability in multi-agent systems and improve generalization across a variety of scenarios.
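The regret-based weighting described in the abstract can be illustrated with a minimal sketch. The abstract only states that task regret is the difference between the current-stage return and the candidate best one, and that learning weights are adjusted from it. Everything else here is an assumption made for illustration: the sign convention (best minus current, clipped at zero), the softmax mapping from regret to weight, the `temperature` parameter, and all function names are hypothetical and are not taken from TETQmix.

```python
import math


def task_regrets(current_returns, best_returns):
    """Per-task regret, assumed here to be (candidate best - current), clipped at 0."""
    return {
        task: max(best_returns[task] - current_returns[task], 0.0)
        for task in current_returns
    }


def regret_weights(regrets, temperature=1.0):
    """Map regrets to weights with a softmax (an assumed choice, not the paper's rule).

    Tasks with larger regret, i.e. tasks that lag further behind their best
    return, receive larger learning weights. The weights sum to 1.
    """
    peak = max(regrets.values())  # subtract the max for numerical stability
    exps = {t: math.exp((r - peak) / temperature) for t, r in regrets.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}


def weighted_td_loss(td_losses, weights):
    """Combine per-task temporal-difference losses using the task weights."""
    return sum(weights[t] * td_losses[t] for t in td_losses)


# Hypothetical returns for two tasks: "push" is close to its best, "gather" lags.
current = {"push": 8.0, "gather": 2.0}
best = {"push": 10.0, "gather": 10.0}

regrets = task_regrets(current, best)
weights = regret_weights(regrets, temperature=4.0)
loss = weighted_td_loss({"push": 0.5, "gather": 1.5}, weights)
```

In this toy setting the lagging task receives the larger weight, so its temporal-difference loss contributes more to the combined objective. A higher `temperature` flattens the weights toward uniform.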

Multi-Task Multi-Agent Reinforcement Learning with Task-Entity Transformers and Value Decomposition Training
