Abstract: The option framework, one of the most promising Hierarchical Reinforcement Learning (HRL) frameworks, is developed based on the Semi-Markov Decision Problem (SMDP) and employs a triple formulation of the option (i.e., an action policy, a termination probability, and an initiation set). These design choices, however, mean that the option framework: 1) has low sample efficiency, 2) cannot use more stable Markov Decision Problem (MDP) based learning algorithms, 3) represents abstract actions implicitly, and 4) is expensive to scale up. To overcome these problems, we propose a simple yet effective MDP implementation of the option framework: the Skill-Action (SA) architecture. Derived from a novel discovery that the SMDP option framework has an MDP equivalent, SA hierarchically extracts skills (abstract actions) from primary actions and explicitly encodes this knowledge into skill context vectors (embedding vectors). Although SA is MDP-formulated, skills can still be temporally extended by applying the attention mechanism to skill context vectors. Unlike the option framework, which requires M action policies for M skills, SA's action policy needs only one decoder to decode skill context vectors into primary actions. Under this formulation, SA can be optimized with any MDP-based policy gradient algorithm. Moreover, it is sample efficient, cheap to scale up, and theoretically proven to have lower variance. Our empirical studies on challenging infinite-horizon robot simulation environments demonstrate that SA not only outperforms all baselines by a large margin, but also exhibits smaller variance, faster convergence, and good interpretability. On transfer learning tasks, SA also outperforms the other models and shows its advantage in reusing knowledge across tasks. A potential impact of SA is to pave the way for a large-scale pre-training architecture in the reinforcement learning area.
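The structure described in the abstract (M skill context vectors, attention over those vectors to pick a skill, and a single decoder shared by all skills) can be illustrated with a toy sketch. This is not the authors' implementation: the class name `SkillActionSketch`, the weight matrices `w_query` and `w_decode`, the dot-product attention form, the `tanh` output, and all dimensions are assumptions chosen for illustration, and no training step is shown.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(vec, mat):
    """Multiply a row vector (length n) by a matrix (n rows, k columns)."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


class SkillActionSketch:
    """Hypothetical toy model: M skill vectors, attention-based selection, one decoder."""

    def __init__(self, num_skills, state_dim, embed_dim, action_dim, seed=0):
        self.rng = random.Random(seed)

        def rand(rows, cols, scale):
            return [[self.rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

        # One embedding ("skill context vector") per skill.
        self.skill_vectors = rand(num_skills, embed_dim, 1.0)
        # Assumed projections: query for attention, and the single shared decoder.
        self.w_query = rand(state_dim + embed_dim, embed_dim, 0.1)
        self.w_decode = rand(state_dim + embed_dim, action_dim, 0.1)

    def skill_probs(self, state, prev_skill):
        # The query depends on the state and the previous skill's vector, so the
        # choice of the next skill is conditioned on the skill currently in use.
        query = matvec(list(state) + self.skill_vectors[prev_skill], self.w_query)
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in self.skill_vectors]
        return softmax(scores)

    def act(self, state, prev_skill):
        probs = self.skill_probs(state, prev_skill)
        skill = self.rng.choices(range(len(probs)), weights=probs)[0]
        # A single decoder maps (state, chosen skill vector) to a primary action,
        # instead of keeping one action policy per skill.
        decoder_input = list(state) + self.skill_vectors[skill]
        action = [math.tanh(x) for x in matvec(decoder_input, self.w_decode)]
        return skill, action


# Example usage with arbitrary dimensions.
model = SkillActionSketch(num_skills=4, state_dim=3, embed_dim=8, action_dim=2)
skill, action = model.act([0.1, -0.2, 0.3], prev_skill=0)
```

In this sketch, adding a skill adds one embedding row rather than a whole new action policy, which is one way to read the abstract's claim that SA is cheap to scale up.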

Hands-Free: Action Abstraction with Hierarchical Reinforcement Learning in Text-Based Games

Abstract then Play: A Skill-centric Reinforcement Learning Framework for Text-based Games

Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games

A Minimal Approach for Natural Language Action Space in Text-based Games

An Analysis of Deep Reinforcement Learning Agents for Text-based Games

Generalization in Text-based Games via Hierarchical Reinforcement Learning

Perceiving the World: Question-guided Reinforcement Learning for Text-based Games

The Skill-Action Architecture: Learning Abstract Action Embeddings for Reinforcement Learning

Meta-Reinforcement Learning for Mastering Multiple Skills and Generalizing across Environments in Text-based Games

Language Understanding for Text-based Games Using Deep Reinforcement Learning

Hierarchical Decision Making by Generating and Following Natural Language Instructions

Intelligent Decision-Making and Human Language Communication Based on Deep Reinforcement Learning in a Wargame Environment

Using reinforcement learning to learn how to play text-based games

Object-Oriented State Abstraction in Reinforcement Learning for Video Games

Hierarchical reinforcement learning with natural language subgoals

Revisiting the Roles of "Text" in Text Games

Interactive Language Learning by Question Answering

Transfer in Deep Reinforcement Learning using Knowledge Graphs

You Only Look at Screens: Multimodal Chain-of-Action Agents

Learning to Play Text-based Adventure Games with Maximum Entropy Reinforcement Learning