Abstract: In interactive e-learning environments such as Intelligent Tutoring Systems, pedagogical decisions can be made at different levels of granularity. In this work, we focus on making decisions at two levels, whole problems vs. single steps, and explore three types of granularity: problem-level only (Prob-Only), step-level only (Step-Only), and both problem and step levels (Both). More specifically, for Prob-Only, our pedagogical agent decides whether the next problem should be a worked example (WE) or a problem-solving task (PS). In WEs, students observe how the tutor solves a problem, while in PSs students solve the problem themselves. For Step-Only, the agent decides whether to elicit the student's next solution step or to tell the step directly. Here the student and the tutor co-construct the solution, and we refer to this type of task as collaborative problem-solving (CPS). For Both, the agent first decides whether the next problem should be a WE, a PS, or a CPS; based on that problem-level decision, the agent then makes step-level decisions on whether to elicit or tell each step. In a series of classroom studies, we compare the three types of granularity under random yet reasonable pedagogical decisions. Results showed that Prob-Only may be less effective for High students and Step-Only may be less effective for Low ones, whereas Both can be effective for both High and Low students. Motivated by these findings, we propose and apply an offline, off-policy, Gaussian Process-based Hierarchical Reinforcement Learning (HRL) framework to induce a hierarchical pedagogical policy that makes adaptive, effective decisions at both the problem and step levels. In an empirical classroom study, our results showed that the HRL policy is significantly more effective than a Deep Q-Network (DQN) induced step-level policy and a random yet reasonable step-level baseline policy.
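The two-level decision structure described for Both can be illustrated with a minimal sketch. It is not the induced HRL policy: both levels here choose uniformly at random, standing in loosely for the random baseline, and every name (`plan_problem`, `random_problem_policy`, `random_step_policy`) is hypothetical. It also assumes, following the definitions of WE and PS above, that a WE tells every step and a PS elicits every step, so step-level choices only vary within a CPS.

```python
import random
from typing import Callable, List, Tuple

PROBLEM_TYPES = ("WE", "PS", "CPS")   # worked example, problem solving, collaborative
STEP_ACTIONS = ("elicit", "tell")     # ask the student for the step, or show it

def random_problem_policy(rng: random.Random) -> str:
    """Hypothetical problem-level policy: pick a problem type uniformly."""
    return rng.choice(PROBLEM_TYPES)

def random_step_policy(rng: random.Random) -> str:
    """Hypothetical step-level policy: pick elicit or tell uniformly."""
    return rng.choice(STEP_ACTIONS)

def plan_problem(
    n_steps: int,
    rng: random.Random,
    problem_policy: Callable[[random.Random], str] = random_problem_policy,
    step_policy: Callable[[random.Random], str] = random_step_policy,
) -> Tuple[str, List[str]]:
    """Make one problem-level decision, then the step-level decisions it implies."""
    problem_type = problem_policy(rng)
    if problem_type == "WE":
        # Assumption: the tutor shows every step of a worked example.
        steps = ["tell"] * n_steps
    elif problem_type == "PS":
        # Assumption: the student produces every step when problem solving.
        steps = ["elicit"] * n_steps
    else:
        # CPS: the step-level policy decides each step individually.
        steps = [step_policy(rng) for _ in range(n_steps)]
    return problem_type, steps

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(plan_problem(n_steps=4, rng=rng))
```

Replacing the two random policies with learned ones, conditioned on student state, would give the shape of a hierarchical policy; how the paper's framework actually represents state and value is not specified in the abstract.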

Relabeling and policy distillation of hierarchical reinforcement learning

Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Data-Efficient Hierarchical Reinforcement Learning for Robotic Assembly Control Applications

Temporal-adaptive Hierarchical Reinforcement Learning

Algorithms for Batch Hierarchical Reinforcement Learning

Bidirectional-Reachable Hierarchical Reinforcement Learning with Mutually Responsive Policies

HRL2E: Hierarchical Reinforcement Learning with Low-level Ensemble

$\epsilon$-Invariant Hierarchical Reinforcement Learning for Building Generalizable Policy

Meta-Learning Integration in Hierarchical Reinforcement Learning for Advanced Task Complexity

Hierarchical Reinforcement Learning from Demonstration via Reachability-Based Reward Shaping

Reinforcement Learning via Auxiliary Task Distillation

Integrating human learning and reinforcement learning: A novel approach to agent training

Deep Reinforcement Learning from Hierarchical Preference Design

Goal-Conditioned Hierarchical Reinforcement Learning with High-Level Model Approximation

On Credit Assignment in Hierarchical Reinforcement Learning

LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning

Sub-policy Adaptation for Hierarchical Reinforcement Learning

Leveraging Granularity: Hierarchical Reinforcement Learning for Pedagogical Policy Induction

Hierarchical Reinforcement Learning in Complex 3D Environments

HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem

MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint