Abstract: As intelligent systems become widespread, the need for intelligent machines to cooperate has driven growing research interest in collaborative multi-agent reinforcement learning (MARL). Existing approaches typically address this challenge through task decomposition of the environment or role classification of agents. However, these methods may rely on parameter sharing between agents, which leads to homogeneous agent behavior and is ineffective for complex tasks, or on training driven by external rewards, which adapts poorly to scenarios with sparse rewards. To address these challenges, we propose a novel dynamic skill learning (DSL) framework in which agents learn more diverse abilities motivated by intrinsic rewards. Specifically, DSL has two components: (i) Dynamic skill discovery, which encourages the production of meaningful skills by exploring the environment in an unsupervised manner, using the inner product between a skill vector and a trajectory representation to generate intrinsic rewards. A Lipschitz constraint on the state representation function is used to ensure that the learned skills produce proper trajectories. (ii) Dynamic skill assignment, which uses a policy controller to assign skills to each agent based on its trajectory latent variables. In addition, to avoid training instability caused by frequent changes in skill selection, we introduce a regularization term that limits skill switching between adjacent time steps. We thoroughly evaluate DSL on two challenging benchmarks, StarCraft II and Google Research Football. Experimental results show that, compared with strong baselines such as QMIX and RODE, DSL effectively improves performance and adapts better to difficult collaborative scenarios.
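The two quantities named in the abstract, an inner-product intrinsic reward and a penalty on skill switching between adjacent time steps, can be illustrated with a minimal sketch. The function names `intrinsic_reward` and `switch_penalty`, the coefficient `coeff`, and the choice to count switches are assumptions made for illustration only; the abstract does not give the exact form of either term, and the learned state representation with its Lipschitz constraint is not modeled here.

```python
from typing import Sequence


def intrinsic_reward(skill: Sequence[float], traj_repr: Sequence[float]) -> float:
    """Inner product between a skill vector and a trajectory representation.

    Both inputs are plain vectors here; in practice the trajectory
    representation would come from a learned encoder (not shown).
    """
    if len(skill) != len(traj_repr):
        raise ValueError("skill and trajectory representation must have equal length")
    return sum(z * h for z, h in zip(skill, traj_repr))


def switch_penalty(skill_ids: Sequence[int], coeff: float = 0.1) -> float:
    """Hypothetical regularizer: coeff times the number of adjacent time
    steps at which the assigned skill changes."""
    switches = sum(1 for prev, cur in zip(skill_ids, skill_ids[1:]) if prev != cur)
    return coeff * switches


if __name__ == "__main__":
    # A trajectory representation aligned with the skill vector earns more reward.
    print(intrinsic_reward([1.0, 0.0], [0.5, 2.0]))
    # Skill sequence 0,0,1,1,2 switches twice.
    print(switch_penalty([0, 0, 1, 1, 2], coeff=0.1))
```

Under this sketch, a skill is rewarded when the trajectory representation points in the same direction as the skill vector, and an agent whose assigned skill changes often incurs a larger penalty than one that holds a skill across consecutive steps.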

Skill Machines: Temporal Logic Skill Composition in Reinforcement Learning

GSC: A Graph-Based Skill Composition Framework for Robot Learning

Grounding Language for Robotic Manipulation via Skill Library

When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions

Logic-Skill Programming: An Optimization-based Approach to Sequential Skill Planning

SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration

Adversarial Skill Chaining for Long-Horizon Robot Manipulation via Terminal State Regularization

Skill matters: Dynamic skill learning for multi-agent cooperative reinforcement learning

Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning

Language-guided Skill Learning with Temporal Variational Inference

Agentic Skill Discovery

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

Deep Reinforcement Learning with Temporal Logics

Heterogeneous Skill Learning for Multi-agent Tasks

Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation

Skill Induction and Planning with Latent Language

Abstract then Play: A Skill-centric Reinforcement Learning Framework for Text-based Games

Physically-Feasible Repair of Reactive, Linear Temporal Logic-based, High-Level Tasks

MaestroMotif: Skill Design from Artificial Intelligence Feedback

Creating Multi-Level Skill Hierarchies in Reinforcement Learning

LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation