Abstract: Vision-and-language navigation requires an agent to navigate a photo-realistic environment by following natural language instructions. Mainstream methods use imitation learning (IL) so that the agent imitates the behavior of a teacher. The trained model tends to overfit the teacher's biased behavior, which results in poor generalization. Recently, researchers have sought to combine IL and reinforcement learning (RL) to overcome overfitting and improve generalization. However, these methods still rely on expensive trajectory annotation. We propose a hierarchical RL-based method, discovering intrinsic subgoals via hierarchical (DISH) RL, which overcomes the generalization limitations of current methods and removes the need for expensive label annotations. First, the high-level agent (manager) decomposes the complex navigation problem into simple intrinsic subgoals. Then, the low-level agent (worker) uses an intrinsic subgoal-driven attention mechanism to predict actions in a smaller state space. We place no constraints on the semantics that subgoals may convey, allowing the agent to autonomously learn intrinsic, more generalizable subgoals from navigation tasks. Furthermore, we design a novel history-aware discriminator (HAD) for the worker. The discriminator incorporates historical information into subgoal discrimination and provides the worker with additional intrinsic rewards to alleviate reward sparsity. Without labeled actions, our method supervises the worker in a self-supervised manner through subgoals generated by the manager. Multiple comparison experiments on the Room-to-Room (R2R) dataset show that DISH significantly outperforms the baseline in both accuracy and efficiency.
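The abstract does not specify how the worker's reward is formed, so the following is only a minimal sketch of one common way a discriminator can densify a sparse reward. It assumes the discriminator outputs one score per candidate subgoal given the trajectory history, and that the intrinsic reward is the log-probability assigned to the subgoal the manager actually issued. The function names, the softmax form, and the weight `beta` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def intrinsic_reward(discriminator_scores, subgoal_index):
    """Log-probability that the (hypothetical) discriminator assigns to the
    manager's subgoal, given scores computed from the trajectory history."""
    probs = softmax(discriminator_scores)
    return math.log(probs[subgoal_index])


def worker_reward(extrinsic, discriminator_scores, subgoal_index, beta=0.1):
    """Sparse environment reward plus a weighted intrinsic term.
    beta is an assumed trade-off weight, not a value from the paper."""
    return extrinsic + beta * intrinsic_reward(discriminator_scores, subgoal_index)


if __name__ == "__main__":
    # The worker receives no environment reward at this step, but its history
    # makes subgoal 2 easy to identify, so the intrinsic term is close to 0
    # rather than strongly negative.
    print(worker_reward(0.0, [0.1, -0.3, 2.5], subgoal_index=2))
```

Under these assumptions, a worker whose behavior makes the issued subgoal easy to recognize receives a higher reward at every step, even when the environment reward is zero.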

Searching Latent Sub-Goals in Hierarchical Reinforcement Learning as Riemannian Manifold Optimization

HILONet: Hierarchical Imitation Learning from Non-Aligned Observations

Probabilistic Subgoal Representations for Hierarchical Reinforcement Learning

Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs

Efficient Hierarchical Exploration with an Active Subgoal Generation Strategy

Hierarchical reinforcement learning with natural language subgoals

Learning Subgoal Representations with Slow Dynamics

Efficient Exploration through Intrinsic Motivation Learning for Unsupervised Subgoal Discovery in Model-Free Hierarchical Reinforcement Learning

Active Hierarchical Exploration with Stable Subgoal Representation Learning

Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration

Hierarchical reinforcement learning for handling sparse rewards in multi-goal navigation

Discovering Intrinsic Subgoals for Vision-and-Language Navigation via Hierarchical Reinforcement Learning

Hierarchical Subtask Discovery With Non-Negative Matrix Factorization

Goal-Conditioned Hierarchical Reinforcement Learning with High-Level Model Approximation

Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning

Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction

Connect-Based Subgoal Discovery for Options in Hierarchical Reinforcement Learning

MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint

Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning

Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction

Goal Space Abstraction in Hierarchical Reinforcement Learning via Reachability Analysis