Abstract: Hierarchical model-based reinforcement learning (HMBRL) aims to combine the sample efficiency of model-based reinforcement learning (MBRL) with the abstraction capability of hierarchical reinforcement learning (HRL) to solve complex tasks efficiently. While HMBRL has great potential, it still lacks wide adoption. In this work we describe a novel HMBRL framework and evaluate it thoroughly. To complement the multi-layered decision-making idiom characteristic of HRL, we construct hierarchical world models that simulate environment dynamics at various levels of temporal abstraction. These models are used to train a stack of agents that communicate in a top-down manner by proposing goals to their subordinate agents. A significant focus of this study is the exploration of a static and environment-agnostic temporal abstraction, which allows concurrent training of models and agents throughout the hierarchy. Unlike most goal-conditioned H(MB)RL approaches, it also leads to comparatively low-dimensional abstract actions. Although our HMBRL approach did not outperform traditional methods in terms of final episode returns, it successfully facilitated decision making across two levels of abstraction using compact, low-dimensional abstract actions. A central challenge in enhancing our method's performance, as uncovered through comprehensive experimentation, is model exploitation on the abstract level of our world model stack. We provide an in-depth examination of this issue, discussing its implications for the field and suggesting directions for future research to overcome this challenge. By sharing these findings, we aim to contribute to the broader discourse on refining HMBRL methodologies and to assist in the development of more effective autonomous learning systems for complex decision-making environments.
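To make the top-down goal proposal and the static temporal abstraction concrete, the following is a minimal sketch of a two-level control loop, not the implementation evaluated in this work. It assumes a toy one-dimensional environment; the names `run_two_level_episode`, `high_policy`, `low_policy`, `step_env`, `TARGET`, and the abstraction factor `k` are hypothetical and chosen only for illustration. No world models are learned here; the sketch only shows how a fixed `k` determines when the upper agent proposes a new goal to its subordinate.

```python
from typing import Callable, List, Tuple

TARGET = 10.0  # hypothetical task target for the toy environment


def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x into the interval [lo, hi]."""
    return max(lo, min(hi, x))


def high_policy(state: float) -> float:
    """Upper agent: propose a one-dimensional goal at most 2 units toward TARGET."""
    return state + clip(TARGET - state, -2.0, 2.0)


def low_policy(state: float, goal: float) -> float:
    """Lower agent: move at most 0.5 units per step toward the current goal."""
    return clip(goal - state, -0.5, 0.5)


def step_env(state: float, action: float) -> Tuple[float, float]:
    """Toy dynamics: the state shifts by the action; reward is negative distance to TARGET."""
    next_state = state + action
    return next_state, -abs(TARGET - next_state)


def run_two_level_episode(
    high: Callable[[float], float],
    low: Callable[[float, float], float],
    env_step: Callable[[float, float], Tuple[float, float]],
    initial_state: float,
    horizon: int,
    k: int,
) -> Tuple[float, float, List[float]]:
    """Roll out a two-level hierarchy with a static temporal abstraction of k steps.

    The upper agent acts only on steps where t % k == 0, independent of the
    environment; the lower agent acts on every step, conditioned on the latest goal.
    """
    state = initial_state
    total_reward = 0.0
    goals: List[float] = []
    goal = initial_state
    for t in range(horizon):
        if t % k == 0:  # fixed, environment-agnostic schedule
            goal = high(state)
            goals.append(goal)
        action = low(state, goal)
        state, reward = env_step(state, action)
        total_reward += reward
    return state, total_reward, goals


if __name__ == "__main__":
    final_state, ret, goals = run_two_level_episode(
        high_policy, low_policy, step_env, initial_state=0.0, horizon=12, k=4
    )
    print(final_state, ret, goals)
```

Because the schedule depends only on `t` and `k`, both levels can collect experience from the same rollout, which is the property that would allow concurrent training across the hierarchy in this simplified setting.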

Hierarchical Policy Learning is Sensitive to Goal Space Design

Learning Hierarchical Graph-Based Policy for Goal-Reaching in Unknown Environments

Goal Space Abstraction in Hierarchical Reinforcement Learning via Reachability Analysis

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?

Goal Space Abstraction in Hierarchical Reinforcement Learning via Set-Based Reachability Analysis

Sub-policy Adaptation for Hierarchical Reinforcement Learning

On the benefits of pixel-based hierarchical policies for task generalization

Temporal-adaptive Hierarchical Reinforcement Learning

Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

Probabilistic Subgoal Representations for Hierarchical Reinforcement learning

Hierarchical reinforcement learning with natural language subgoals

Spatially and Seamlessly Hierarchical Reinforcement Learning for State Space and Policy space in Autonomous Driving

Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning

Adjacency Constraint for Efficient Hierarchical Reinforcement Learning

Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning

Exploring the limits of Hierarchical World Models in Reinforcement Learning

Active Hierarchical Imitation and Reinforcement Learning

Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies

Hierarchical Orchestra of Policies