HDPlanner: Advancing Autonomous Deployments in Unknown Environments through Hierarchical Decision Networks

Jingsong Liang, Yuhong Cao, Yixiao Ma, Hanqi Zhao, Guillaume Sartoretti
2024-08-07
Abstract: In this paper, we introduce HDPlanner, a deep reinforcement learning (DRL) based framework designed to tackle two core and challenging tasks for mobile robots: autonomous exploration and navigation, where the robot must optimize its trajectory adaptively to achieve the task objective through continuous interactions in unknown environments. Specifically, HDPlanner relies on novel hierarchical attention networks to empower the robot to reason about its belief across multiple spatial scales and sequence collaborative decisions, where our networks decompose long-term objectives into short-term informative task assignments and informative path plannings. We further propose a contrastive learning-based joint optimization to enhance the robustness of HDPlanner. We empirically demonstrate that HDPlanner significantly outperforms state-of-the-art conventional and learning-based baselines on an extensive set of simulations, including hundreds of test maps and large-scale, complex Gazebo environments. Notably, HDPlanner achieves real-time planning with travel distances reduced by up to 35.7% compared to exploration benchmarks and by up to 16.5% compared to navigation benchmarks. Furthermore, we validate our approach on hardware, where it generates high-quality, adaptive trajectories in both indoor and outdoor environments, highlighting its real-world applicability without additional training.
Robotics
What problem does this paper attempt to address?
The paper addresses the challenges that mobile robots face during autonomous exploration and navigation in unknown environments. Specifically:

1. **Autonomous exploration**: The robot must find the shortest path that fully covers the unknown environment, classifying it into free and occupied areas.
2. **Autonomous navigation**: The robot is assigned a faraway destination and must reach it as quickly as possible in the unknown environment.

In both tasks, the robot must continuously and adaptively optimize its trajectory based on its partial belief about the environment (i.e., its accumulated observations). This is a complex optimization problem. In real-world deployments in particular, the robot must balance exploring unknown areas against exploiting its existing belief to complete the task. Online map updates also force the robot to replan in real time so that its planned path remains valid, which makes trajectory optimization harder still.

To address these challenges, the paper proposes HDPlanner, a deep reinforcement learning (DRL) based framework that optimizes the robot's trajectory through continuous interaction with the environment. HDPlanner introduces novel hierarchical attention networks that let the robot reason about its belief across multiple spatial scales and decompose long-term objectives into short-term informative task assignments and informative path planning, so that tasks are completed more efficiently. The paper also proposes a contrastive learning-based joint optimization to enhance the robustness of HDPlanner.

Experimental results show that HDPlanner significantly outperforms state-of-the-art conventional and learning-based baselines in simulation, including hundreds of test maps and large-scale, complex Gazebo environments, reducing travel distance by up to 35.7% on exploration benchmarks and by up to 16.5% on navigation benchmarks. In hardware validation, HDPlanner generates high-quality, adaptive trajectories in both indoor and outdoor environments, demonstrating that it can be applied to the real world without additional training.
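The replanning requirement in the problem setting (a planned path can become invalid whenever the map is updated) can be illustrated with a small sketch. This is not HDPlanner's method, which relies on learned hierarchical attention networks; it is a conventional breadth-first planner used only to show why online map updates trigger replanning. The grid encoding (`FREE`, `OCCUPIED`, `UNKNOWN`) and the helper names `plan` and `path_is_valid` are assumptions made for this example.

```python
from collections import deque

# Hypothetical cell labels for the robot's partial belief about the map.
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def plan(belief, start, goal):
    """Breadth-first search that optimistically treats unknown cells as traversable."""
    rows, cols = len(belief), len(belief[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and belief[nr][nc] != OCCUPIED and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # no route exists under the current belief


def path_is_valid(belief, path):
    """A path stays valid only while none of its cells is known to be occupied."""
    return all(belief[r][c] != OCCUPIED for r, c in path)


# The robot has observed only the left column so far.
belief = [
    [FREE, UNKNOWN, UNKNOWN],
    [FREE, UNKNOWN, UNKNOWN],
    [FREE, UNKNOWN, UNKNOWN],
]
start, goal = (0, 0), (0, 2)

first_path = plan(belief, start, goal)

# A new observation reveals an obstacle on the planned path.
belief[0][1] = OCCUPIED

# The map update invalidates the old path, so the robot replans.
path = first_path
if not path_is_valid(belief, path):
    path = plan(belief, start, goal)

print(first_path)
print(path)
```

In this toy map the first plan cuts straight through unknown cells, and the updated belief forces a detour through the middle row. A learned planner such as HDPlanner faces the same invalidation problem, but must additionally decide where to gather information rather than only reacting to blocked paths.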