Abstract: Hierarchical model-based reinforcement learning (HMBRL) aims to combine the sample efficiency of model-based reinforcement learning (MBRL) with the abstraction capability of hierarchical reinforcement learning (HRL) to solve complex tasks efficiently. While HMBRL has great potential, it has not yet been widely adopted. In this work, we describe a novel HMBRL framework and evaluate it thoroughly. To complement the multi-layered decision-making idiom characteristic of HRL, we construct hierarchical world models that simulate environment dynamics at various levels of temporal abstraction. These models are used to train a stack of agents that communicate in a top-down manner by proposing goals to their subordinate agents. A significant focus of this study is the exploration of a static and environment-agnostic temporal abstraction, which allows concurrent training of models and agents throughout the hierarchy. Unlike most goal-conditioned H(MB)RL approaches, it also leads to comparatively low-dimensional abstract actions. Although our HMBRL approach did not outperform traditional methods in terms of final episode returns, it successfully facilitated decision making across two levels of abstraction using compact, low-dimensional abstract actions. A central challenge in enhancing our method's performance, as uncovered through comprehensive experimentation, is model exploitation on the abstract level of our world model stack. We provide an in-depth examination of this issue, discussing its implications for the field and suggesting directions for future research to overcome this challenge. By sharing these findings, we aim to contribute to the broader discourse on refining HMBRL methodologies and to assist in the development of more effective autonomous learning systems for complex decision-making environments.
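The top-down goal proposal with a static temporal abstraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the abstraction is a fixed interval `k` of low-level steps per high-level decision, and every name here (`run_two_level_episode`, `high_policy`, `low_policy`, `env_step`) as well as the one-dimensional toy dynamics is hypothetical.

```python
from typing import Callable, List, Tuple


def run_two_level_episode(
    high_policy: Callable[[float], float],
    low_policy: Callable[[float, float], float],
    env_step: Callable[[float, float], float],
    initial_state: float,
    num_steps: int,
    k: int,
) -> Tuple[float, List[float]]:
    """Roll out a two-level hierarchy with a fixed abstraction interval k.

    Assumption: the high-level agent proposes a new goal every k low-level
    steps, independent of the environment (a static temporal abstraction).
    """
    state = initial_state
    goal = 0.0
    goals: List[float] = []
    for t in range(num_steps):
        if t % k == 0:
            # High level acts once per k steps and hands a goal downward.
            goal = high_policy(state)
            goals.append(goal)
        # Low level acts every step, conditioned on the current goal.
        action = low_policy(state, goal)
        state = env_step(state, action)
    return state, goals


# Hypothetical toy components on a 1-D line.
def toy_high_policy(state: float) -> float:
    # Propose a goal four units ahead of the current state.
    return state + 4.0


def toy_low_policy(state: float, goal: float) -> float:
    # Move toward the goal, at most one unit per step.
    return max(-1.0, min(1.0, goal - state))


def toy_env_step(state: float, action: float) -> float:
    # Deterministic additive dynamics.
    return state + action


final_state, proposed_goals = run_two_level_episode(
    toy_high_policy, toy_low_policy, toy_env_step,
    initial_state=0.0, num_steps=8, k=4,
)
print(final_state, proposed_goals)
```

In this toy setting the abstract action is a single scalar goal, which loosely mirrors the low-dimensional abstract actions mentioned in the abstract. In an actual HMBRL system the policies would be learned and trained against world models at each level rather than hand-written.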

Hierarchical Prototypes for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning

A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning

Prototypical context-aware dynamics generalization for high-dimensional model-based reinforcement learning

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Learning Hierarchical Behavior and Motion Planning for Autonomous Driving

Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling

Look Before You Leap: Safe Model-Based Reinforcement Learning with Human Intervention

Curiosity & Entropy Driven Unsupervised RL in Multiple Environments

A Goal-Conditioned Reinforcement Learning Algorithm with Environment Modeling

Learning Parsimonious Dynamics for Generalization in Reinforcement Learning

Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL

Learning Latent Dynamic Robust Representations for World Models

Dynamics-Aware Unsupervised Discovery of Skills

Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning

A Behavior-Aware Approach for Deep Reinforcement Learning in Non-stationary Environments without Known Change Points

Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning

ED2: Environment Dynamics Decomposition World Models for Continuous Control

Exploring the limits of Hierarchical World Models in Reinforcement Learning

LEADS: Learning Dynamical Systems that Generalize Across Environments

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

Improving Generalization in Reinforcement Learning Training Regimes for Social Robot Navigation