Abstract: Humans can leverage hierarchical structures to split a task into sub-tasks and solve problems efficiently. Imitation learning, reinforcement learning, and combinations of the two with hierarchical structures have been shown to be efficient ways for robots to learn complex tasks with sparse rewards. However, previous work on hierarchical imitation and reinforcement learning has been tested in relatively simple 2D games with discrete action spaces. Furthermore, many imitation learning works focus on improving policies learned from expert policies that are hard-coded or trained by reinforcement learning algorithms, rather than from human experts. In human-robot interaction scenarios, humans may be required to provide demonstrations to teach the robot, so it is crucial to improve learning efficiency to reduce expert effort and to understand how humans perceive the learning/training process. In this project, we explored different imitation learning algorithms and designed active learning algorithms on top of the hierarchical imitation and reinforcement learning framework we have developed. We performed an experiment in which five participants were asked to guide a randomly initialized agent to a random goal in a maze. Our experimental results showed that using DAgger and a reward-based active learning method can achieve better performance while reducing the physical and mental effort required of humans during the training process.

On the benefits of pixel-based hierarchical policies for task generalization

Learning Hierarchical Graph-Based Policy for Goal-Reaching in Unknown Environments

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?

Hierarchical Policy Learning is Sensitive to Goal Space Design

Sub-policy Adaptation for Hierarchical Reinforcement Learning

Leveraging the Efficiency of Multi-Task Robot Manipulation Via Task-Evoked Planner and Reinforcement Learning

Hierarchical Potential-based Reward Shaping from Task Specifications

Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning

Evolving hierarchical memory-prediction machines in multi-task reinforcement learning

Bidirectional-Reachable Hierarchical Reinforcement Learning with Mutually Responsive Policies

Hierarchical Visual Policy Learning for Long-Horizon Robot Manipulation in Densely Cluttered Scenes

Active Hierarchical Imitation and Reinforcement Learning

Hierarchical Reinforcement Learning Based on Continuous Subgoal Space

Hierarchical Skills for Efficient Exploration

Hierarchical Orchestra of Policies

Hierarchical Subspaces of Policies for Continual Offline Reinforcement Learning

Efficient Multi-Task Reinforcement Learning via Task-Specific Action Correction

Developing cooperative policies for multi-stage reinforcement learning tasks

Multi-Task Off-Policy Learning from Bandit Feedback

A Brain-Inspired Incremental Multi-task Reinforcement Learning Approach

Multi-Task Policy Search