Abstract: Reinforcement learning (RL) has shown remarkable success in navigating complex robotic and gaming landscapes. However, achieving such results often requires a substantial number of interaction episodes between the agent and its environment, especially in scenarios with sparse and long-term rewards. Although expert demonstrations and hierarchical structures can improve the sample efficiency of RL, noise in expert demonstrations can degrade performance. Here we address this challenge by introducing a novel measure, the noise elimination factor with reachable coverage, to quantify the noise in trajectory demonstrations. We propose a filtering method based on this measure, which eliminates noise that deviates from the main demonstration clusters and mitigates the adverse impact of imperfect demonstrations, particularly in hierarchical reinforcement learning. To make better use of the filtered demonstrations, we further eliminate similar and redundant instances, constructing a concise and semantically clear demonstration set for subgoal graph construction. This culminates in a Reachable Coverage-based Hierarchical Reinforcement Learning method (RCHRL). Experiments in complex robot control tasks and Maze environments show that our approach removes demonstration noise effectively and surpasses recent state-of-the-art demonstration-guided reinforcement learning methods in both asymptotic performance and stability. Our code is available at https://github.com/YuTang06/RCHRL.
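The two filtering steps named in the abstract (removing states that deviate from the main demonstration clusters, then dropping near-duplicate states) can be illustrated with a rough sketch. The abstract does not define the noise elimination factor or reachable coverage, so the code below swaps in a plain Euclidean neighbor count as a stand-in; it is not the RCHRL measure. The function names and the parameters `radius`, `min_neighbors`, and `min_gap` are hypothetical and chosen only for illustration.

```python
import numpy as np


def coverage_counts(states, radius):
    """Count, for each state, how many OTHER states lie within `radius`.

    Assumption: Euclidean distance stands in for the paper's reachable
    coverage, which the abstract does not specify.
    """
    diffs = states[:, None, :] - states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return (dists <= radius).sum(axis=1) - 1  # subtract 1 to exclude self


def filter_noise(states, radius, min_neighbors):
    """Keep only states with at least `min_neighbors` nearby states."""
    keep = coverage_counts(states, radius) >= min_neighbors
    return states[keep]


def drop_redundant(states, min_gap):
    """Greedily keep a state only if it is at least `min_gap` from every
    state already kept (hypothetical deduplication rule)."""
    kept = []
    for s in states:
        if all(np.linalg.norm(s - k) >= min_gap for k in kept):
            kept.append(s)
    return np.array(kept).reshape(-1, states.shape[1])


# Toy demonstration states: a tight cluster plus one far-away outlier.
demo_states = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
)
filtered = filter_noise(demo_states, radius=0.5, min_neighbors=2)
concise = drop_redundant(filtered, min_gap=0.12)
```

In this toy setting the isolated point is discarded by `filter_noise`, and `drop_redundant` thins the remaining cluster. A reachability-based measure would presumably replace the Euclidean distance with a notion of how easily one state can be reached from another, which this sketch does not attempt to model.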

Hierarchical Reinforcement Learning from Imperfect Demonstrations Through Reachable Coverage-Based Subgoal Filtering

Hierarchical Reinforcement Learning from Demonstration via Reachability-Based Reward Shaping

Reinforcement Learning with Supervision from Noisy Demonstrations

Demonstration actor critic

Exploration-efficient Deep Reinforcement Learning with Demonstration Guidance for Robot Control

Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

A reinforcement learning algorithm acquires demonstration from the training agent by dividing the task space

Hierarchical reinforcement learning for handling sparse rewards in multi-goal navigation

Boosting Reinforcement Learning via Hierarchical Game Playing With State Relay

Learning Representations in Model-Free Hierarchical Reinforcement Learning

Overcoming Exploration in Reinforcement Learning with Demonstrations

Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning

Hierarchical reinforcement learning with natural language subgoals

Towards Sample-efficient Apprenticeship Learning from Suboptimal Demonstration

Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards

Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments

Residual Reinforcement Learning from Demonstrations

Data-Efficient Hierarchical Reinforcement Learning for Robotic Assembly Control Applications

Advances in Hierarchical Reinforcement Learning

DEFENDER: DTW-Based Episode Filtering Using Demonstrations for Enhancing RL Safety

Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations