Hierarchical reinforcement learning for handling sparse rewards in multi-goal navigation

Jiangyue Yan, Biao Luo, Xiaodong Xu
DOI: https://doi.org/10.1007/s10462-024-10794-3
IF: 9.588
2024-05-29
Artificial Intelligence Review
Abstract: Reinforcement learning (RL) has achieved remarkable advancements in navigation tasks in recent years. However, tackling multi-goal navigation tasks with sparse rewards remains a complex and challenging problem due to the long-sequence decision-making involved. Such multi-goal navigation tasks inherently incorporate a hybrid action space, where the robot needs to select a navigation endpoint first before executing primitive actions. To address the problem of multi-goal navigation with sparse rewards, we introduce a novel hierarchical RL framework named Hierarchical RL with Multi-Goal (HRL-MG). The main idea of HRL-MG is to divide and conquer the hybrid action space, splitting long-sequence decisions into short-sequence decisions. The HRL-MG framework is composed of two main modules: a selector and an actuator. The selector employs a temporal abstraction hierarchical architecture designed to specify a desired end goal based on the discrete action space. Conversely, the actuator utilizes a continuous goal-oriented hierarchical architecture developed to enact continuous action sequences to reach the desired end goal specified by the selector. In addition, we incorporate a dynamic goal detection mechanism, grounded in hindsight experience replay, to mitigate the challenges posed by sparse reward landscapes. We validated the algorithm's efficacy on both the discrete environment Maze_2D and the continuous robotic environment MuJoCo 'Ant'. The results indicate that HRL-MG significantly outperforms other methods in multi-goal navigation tasks with sparse rewards.
computer science, artificial intelligence
What problem does this paper attempt to address?
The paper primarily proposes a new solution to the sparse reward problem in multi-goal navigation tasks. Specifically, the paper addresses the following issues:

1. **Background and Challenges**:
   - Current Reinforcement Learning (RL) methods have made significant progress in single-goal navigation tasks but face challenges in multi-goal navigation tasks, especially in sparse reward environments.
   - Multi-goal navigation tasks require the robot to select a final destination (discrete action space) and then execute primitive actions (continuous action space), introducing the problem of hybrid action space.
   - Existing reward design methods, such as distance rewards, can lead to suboptimal solutions in multi-goal tasks.
2. **Proposed Method**:
   - The paper proposes a novel hierarchical reinforcement learning framework named Hierarchical RL with Multi-Goal (HRL-MG).
   - The HRL-MG framework consists of two parts: Selector and Actuator.
     - **Selector**: Responsible for selecting the current goal (i.e., the desired destination) from multiple goals, using a discrete action space.
     - **Actuator**: Executes primitive actions through a continuous action space to reach the goal provided by the Selector.
   - To overcome the challenges posed by sparse rewards, the paper also introduces a dynamic goal detection mechanism based on Hindsight Experience Replay (HER).
3. **Summary of Contributions**:
   - A novel hierarchical framework is proposed, combining temporal abstraction architecture and continuous goal-oriented architecture to address the hybrid action space problem in multi-goal tasks.
   - A HER-based dynamic goal detection method is introduced in both the Selector and Actuator, further alleviating the training difficulties in sparse reward environments.
4. **Experimental Validation**:
   - The paper conducts experimental validation in two types of environments: discrete environment Maze_2D and continuous environment MuJoCo 'Ant'.
   - Experimental results show that the HRL-MG algorithm significantly outperforms other methods in handling multi-goal navigation tasks, especially under sparse reward conditions.

In summary, this paper aims to solve the sparse reward problem in multi-goal navigation tasks through a new hierarchical reinforcement learning method and demonstrates its effectiveness and superiority through experiments.
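To make the Selector/Actuator split and the hindsight idea more concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: both policies are random placeholders, the environment is an assumed unbounded 2D integer grid, all names (`selector`, `actuator`, `rollout`, `her_relabel`) are hypothetical, and the relabeling uses the standard HER "final" strategy rather than the paper's dynamic goal detection mechanism, whose details are not given in this summary.

```python
import random
from typing import List, Tuple

Pos = Tuple[int, int]
# (state, action, next_state, goal, reward); layout is an assumption for this sketch
Transition = Tuple[Pos, Pos, Pos, Pos, float]

MOVES: List[Pos] = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def sparse_reward(pos: Pos, goal: Pos) -> float:
    """Sparse signal: 1 only when the goal cell is reached, otherwise 0."""
    return 1.0 if pos == goal else 0.0


def selector(goals: List[Pos], rng: random.Random) -> int:
    """High level: discrete choice of which end goal to pursue.
    Placeholder random policy; a learned policy would go here."""
    return rng.randrange(len(goals))


def actuator(pos: Pos, goal: Pos, rng: random.Random) -> Pos:
    """Low level: primitive move conditioned on the selected goal.
    Placeholder random policy that ignores its inputs."""
    return rng.choice(MOVES)


def rollout(start: Pos, goals: List[Pos], horizon: int,
            rng: random.Random) -> List[Transition]:
    """Select one goal, then let the actuator act for up to `horizon` steps."""
    goal = goals[selector(goals, rng)]
    pos = start
    episode: List[Transition] = []
    for _ in range(horizon):
        move = actuator(pos, goal, rng)
        nxt = (pos[0] + move[0], pos[1] + move[1])
        episode.append((pos, move, nxt, goal, sparse_reward(nxt, goal)))
        pos = nxt
        if pos == goal:
            break
    return episode


def her_relabel(episode: List[Transition]) -> List[Transition]:
    """Standard HER 'final' strategy: treat the last reached state as if it
    had been the goal, and recompute the sparse reward accordingly."""
    achieved = episode[-1][2]
    return [(s, a, s2, achieved, sparse_reward(s2, achieved))
            for (s, a, s2, _, _) in episode]


rng = random.Random(0)
# Both goals are more than 8 steps away, so the original episode earns no reward.
episode = rollout(start=(0, 0), goals=[(6, 6), (-6, 4)], horizon=8, rng=rng)
relabeled = her_relabel(episode)
```

In this sketch the original episode carries no learning signal because every reward is zero, while the relabeled copy ends with a reward of 1 for the goal that was actually reached. That is the basic way hindsight relabeling densifies a sparse reward; how HRL-MG applies the idea at the Selector and Actuator levels is described in the paper itself.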