Abstract: Covering option discovery has been developed to improve the exploration of reinforcement learning in single-agent scenarios, where only sparse reward signals are available. It aims to connect the most distant states identified through the Fiedler vector of the state transition graph. However, the approach cannot be directly extended to multiagent scenarios, since the joint state space grows exponentially with the number of agents, thus prohibiting efficient option computation. Existing research adopting options in multiagent scenarios still relies on single-agent algorithms and fails to directly discover joint options that can improve the connectivity of the joint state space. In this article, we propose a new algorithm to directly compute multiagent options with collaborative exploratory behaviors while still enjoying the ease of decomposition. Our key idea is to approximate the joint state space as the Kronecker product of individual agents' state spaces, based on which we can directly estimate the Fiedler vector of the joint state space using the Laplacian spectrum of individual agents' transition graphs. This decomposition enables us to efficiently construct multiagent joint options by encouraging agents to connect the subgoal joint states, which correspond to the minimum or maximum of the estimated joint Fiedler vector. Evaluation on multiagent collaborative tasks shows that our algorithm can successfully identify multiagent options and significantly outperforms prior work using single-agent options or no options, in terms of both faster exploration and higher cumulative rewards.
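The decomposition described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: two agents, the symmetric normalized Laplacian, and small hypothetical transition graphs (paths with self-loops, so that an agent may stay in place). Under those assumptions, the normalized Laplacian of a Kronecker product graph has eigenvalues 1 - (1 - λ_i)(1 - μ_j) and eigenvectors that are Kronecker products of the per-agent eigenvectors, so the joint Fiedler vector can be assembled without ever diagonalizing the joint matrix. The function names below are illustrative and are not taken from the paper.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def estimate_joint_fiedler(adj_a, adj_b):
    """Assemble the joint Fiedler pair from the two per-agent spectra.

    Assumes the joint transition graph is the Kronecker product of the
    per-agent graphs (the approximation named in the abstract).
    """
    lam_a, vec_a = np.linalg.eigh(normalized_laplacian(adj_a))
    lam_b, vec_b = np.linalg.eigh(normalized_laplacian(adj_b))
    # Every joint eigenvalue has the form 1 - (1 - lam_a[i]) * (1 - lam_b[j]).
    joint = 1.0 - np.outer(1.0 - lam_a, 1.0 - lam_b)
    # The second-smallest entry is the estimated Fiedler value.
    flat_order = np.argsort(joint, axis=None)
    i, j = np.unravel_index(flat_order[1], joint.shape)
    return joint[i, j], np.kron(vec_a[:, i], vec_b[:, j])


def path_with_self_loops(n):
    """Hypothetical per-agent transition graph: a corridor of n states."""
    return np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)


adj_a = path_with_self_loops(3)
adj_b = path_with_self_loops(4)
lam, fiedler = estimate_joint_fiedler(adj_a, adj_b)

# Subgoal joint states: the minimum and maximum of the estimated vector,
# mapped back to (state of agent A, state of agent B).
shape = (len(adj_a), len(adj_b))
subgoal_min = np.unravel_index(np.argmin(fiedler), shape)
subgoal_max = np.unravel_index(np.argmax(fiedler), shape)
print("estimated Fiedler value:", lam)
print("subgoal joint states:", subgoal_min, subgoal_max)
```

In this sketch only a 3x3 and a 4x4 eigenproblem are solved, whereas the joint graph has 12 states; with more agents or larger state spaces the gap between the per-agent and joint problem sizes grows accordingly. A joint option would then be trained to move the agents between the two printed subgoal joint states.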

Option Automatic Generation in Hierarchical Reinforcement Learning

Autonomous Discovery and Creation of Options in Hierarchical Reinforcement Learning

An agent with a sense of direction for option discovery in hierarchical reinforcement learning

Adversarial Option-Aware Hierarchical Imitation Learning

Hierarchical reinforcement learning with unlimited option scheduling for sparse rewards in continuous spaces

A Provably Efficient Option-Based Algorithm for both High-Level and Low-Level Learning

Multi-Level Discovery of Deep Options

Online Baum-Welch algorithm for Hierarchical Imitation Learning

Abstract Value Iteration for Hierarchical Reinforcement Learning

OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning

Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization

Hierarchical Average Reward Policy Gradient Algorithms

Automatic formation of the structure of abstract machines in hierarchical reinforcement learning with state clustering

Option-Based Hierarchical Reinforcement Learning for UAV Multi-Objective Path Planning

Unveiling Options with Neural Decomposition

Hierarchical Reinforcement Learning Algorithm Based on Structural State-Space

Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

Learning Multiagent Options for Tabular Reinforcement Learning Using Factor Graphs

Efficient Hierarchical Exploration with an Active Subgoal Generation Strategy

Connect-Based Subgoal Discovery for Options in Hierarchical Reinforcement Learning

Hierarchical Planning and Learning for Robots in Stochastic Settings Using Zero-Shot Option Invention