Abstract: Hierarchical model-based reinforcement learning (HMBRL) aims to combine the sample efficiency of model-based reinforcement learning (MBRL) with the abstraction capability of hierarchical reinforcement learning (HRL) to solve complex tasks efficiently. While HMBRL has great potential, it still lacks wide adoption. In this work, we describe a novel HMBRL framework and evaluate it thoroughly. To complement the multi-layered decision-making idiom characteristic of HRL, we construct hierarchical world models that simulate environment dynamics at various levels of temporal abstraction. These models are used to train a stack of agents that communicate in a top-down manner by proposing goals to their subordinate agents. A significant focus of this study is the exploration of a static and environment-agnostic temporal abstraction, which allows concurrent training of models and agents throughout the hierarchy. Unlike most goal-conditioned H(MB)RL approaches, it also leads to comparatively low-dimensional abstract actions. Although our HMBRL approach did not outperform traditional methods in terms of final episode returns, it successfully facilitated decision making across two levels of abstraction using compact, low-dimensional abstract actions. A central challenge in enhancing our method's performance, as uncovered through comprehensive experimentation, is model exploitation at the abstract level of our world model stack. We provide an in-depth examination of this issue, discussing its implications for the field and suggesting directions for future research to overcome this challenge. By sharing these findings, we aim to contribute to the broader discourse on refining HMBRL methodologies and to assist in the development of more effective autonomous learning systems for complex decision-making environments.
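The top-down goal proposal with a static temporal abstraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `run_two_level_episode`, the fixed interval `k`, the scalar state, and the toy policies below are all assumptions made for illustration. It only shows the control flow in which a high-level agent proposes a low-dimensional goal every `k` steps and a low-level agent acts on it at every step.

```python
from typing import Callable, List, Tuple


def run_two_level_episode(
    high_policy: Callable[[float], float],
    low_policy: Callable[[float, float], float],
    env_step: Callable[[float, float], float],
    state: float,
    horizon: int,
    k: int,
) -> Tuple[float, List[float]]:
    """Roll out a two-level hierarchy with a static abstraction interval k.

    All names here are hypothetical; the abstract does not specify an API.
    """
    goals: List[float] = []
    goal = 0.0
    for t in range(horizon):
        # Static, environment-agnostic abstraction: the high-level agent
        # proposes a new goal every k low-level steps, regardless of state.
        if t % k == 0:
            goal = high_policy(state)
            goals.append(goal)
        # The low-level agent is conditioned on the current goal.
        action = low_policy(state, goal)
        state = env_step(state, action)
    return state, goals


# Toy stand-ins (assumptions): a 1-D state, a goal placed 4 units ahead,
# and a low-level controller that moves at most 1 unit per step.
def high_policy(state: float) -> float:
    return state + 4.0


def low_policy(state: float, goal: float) -> float:
    return max(-1.0, min(1.0, goal - state))


def env_step(state: float, action: float) -> float:
    return state + action


final_state, goals = run_two_level_episode(
    high_policy, low_policy, env_step, state=0.0, horizon=8, k=4
)
print(final_state, goals)
```

In this toy setting the abstract action is a single scalar, which mirrors the abstract's emphasis on compact abstract actions; in a learned system both policies would be trained inside world models at their respective time scales.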

Abstraction-Refinement for Hierarchical Probabilistic Models

A Lazy Abstraction Algorithm for Markov Decision Processes: Theory and Initial Evaluation

Reachability Analysis of Probabilistic Systems by Successive Refinements

Probabilistic Program Abstractions

Unifying Safety Approaches for Stochastic Systems: From Barrier Functions to Uncertain Abstractions via Dynamic Programming

Learning World Models With Hierarchical Temporal Abstractions: A Probabilistic Perspective

Abstraction-based Probabilistic Stability Analysis of Polyhedral Probabilistic Hybrid Systems

Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control

Certified Policy Verification and Synthesis for MDPs under Distributional Reach-avoidance Properties

Abstraction and Counterexample-Guided Refinement in Model Checking of Hybrid Systems

Abstract Value Iteration for Hierarchical Reinforcement Learning

Hierarchy through Composition with Linearly Solvable Markov Decision Processes

Computing Abstraction Hierarchies by Numerical Simulation

Counterexample Explanation by Learning Small Strategies in Markov Decision Processes

Online Model-free Safety Verification for Markov Decision Processes Without Safety Violation

Exploring the limits of Hierarchical World Models in Reinforcement Learning

Verification Of Hybrid Systems Based On Counterexample-Guided Abstraction Refinement

What Are the Odds? Improving the foundations of Statistical Model Checking

Probabilistic unifying relations for modelling epistemic and aleatoric uncertainty: semantics and automated reasoning with theorem proving

Beyond Decisiveness of Infinite Markov Chains

Efficient Probabilistic Abstraction for SysML Activity Diagrams