Learning whom to trust in navigation: dynamically switching between classical and neural planning

Sombit Dey, Assem Sadek, Gianluca Monaci, Boris Chidlovskii, Christian Wolf
DOI: https://doi.org/10.48550/arXiv.2307.16710
2023-07-31
Abstract: Navigation of terrestrial robots is typically addressed either with localization and mapping (SLAM) followed by classical planning on the dynamically created maps, or by machine learning (ML), often through end-to-end training with reinforcement learning (RL) or imitation learning (IL). Recently, modular designs have achieved promising results, and hybrid algorithms that combine ML with classical planning have been proposed. Existing methods implement these combinations with hand-crafted functions, which cannot fully exploit the complementary nature of the policies and the complex regularities between scene structure and planning performance. Our work builds on the hypothesis that the strengths and weaknesses of neural planners and classical planners follow some regularities, which can be learned from training data, in particular from interactions. This is grounded in the assumption that both trained planners and the mapping algorithms underlying classical planning are subject to failure cases depending on the semantics of the scene, and that this dependence is learnable: for instance, certain areas, objects or scene structures can be reconstructed more easily than others. We propose a hierarchical method composed of a high-level planner dynamically switching between a classical and a neural planner. We fully train all neural policies in simulation and evaluate the method in both simulation and real experiments with a LoCoBot robot, showing significant gains in performance, in particular in the real environment. We also qualitatively conjecture on the nature of data regularities exploited by the high-level planner.
Robotics
What problem does this paper attempt to address?
This paper addresses how to combine the strengths of classical planning and neural planning in robot navigation to improve navigation performance. Specifically, it explores two questions:

1. **Are trained planning policies complementary to classical planning strategies, with each performing well in different situations?**
   - The paper hypothesizes that neural planners and classical planners each have their own strengths and weaknesses, that these may depend on scene structure and semantics, and that these regularities can be learned from training data.
2. **Can these different types of scenes be clearly distinguished from visual observations, so that these regularities can be exploited?**
   - The paper proposes a high-level planner, trained with reinforcement learning, that dynamically selects either the neural planner or the classical planner according to the characteristics of the current scene.

To test these hypotheses, the paper proposes a hybrid method with three main components:

- **Neural planner**: Trained end-to-end with deep learning to predict navigation actions directly from visual inputs.
- **Classical planner**: Performs path planning on occupancy maps built by a mapping algorithm.
- **High-level planner**: Trained with reinforcement learning; dynamically selects either the neural planner or the classical planner according to the characteristics of the current scene.

The main contributions of the paper are:

- **Hybrid method**: A high-level planner switches dynamically between the two planning methods, which improves navigation performance.
- **Large-scale training**: All neural policies are trained at large scale on complex RGB-D inputs in a photo-realistic 3D simulation environment.
- **Transfer from simulation to the real environment**: Extensive experiments in the real environment show significant performance improvements.

With this approach, the paper aims to overcome the limitations of relying on a single planning method in complex environments, especially in real-world deployments.
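The switching structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `HierarchicalNavigator`, the `map_confidence` observation field, and the threshold rule in `toy_high_level_policy` are all hypothetical. In the paper the high-level planner is learned with reinforcement learning from visual observations, whereas the hand-written rule here is only a placeholder to make the control flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a real system would pass RGB-D frames and poses.
Observation = Dict[str, float]
Action = str
Planner = Callable[[Observation], Action]

NEURAL, CLASSICAL = 0, 1


@dataclass
class HierarchicalNavigator:
    """Delegates each step to one of two low-level planners."""
    neural_planner: Planner
    classical_planner: Planner
    high_level_policy: Callable[[Observation], int]  # returns NEURAL or CLASSICAL

    def step(self, obs: Observation) -> Tuple[int, Action]:
        # The high-level policy decides which planner to trust for this step.
        choice = self.high_level_policy(obs)
        planner = self.classical_planner if choice == CLASSICAL else self.neural_planner
        return choice, planner(obs)


def run_episode(nav: HierarchicalNavigator,
                observations: List[Observation]) -> List[Tuple[int, Action]]:
    """Apply the navigator to a fixed sequence of observations."""
    return [nav.step(obs) for obs in observations]


def toy_high_level_policy(obs: Observation) -> int:
    # Placeholder rule (assumption): trust the classical planner when the
    # map looks reliable. The paper learns this decision instead.
    return CLASSICAL if obs["map_confidence"] > 0.5 else NEURAL


nav = HierarchicalNavigator(
    neural_planner=lambda obs: "turn_left",    # dummy neural planner
    classical_planner=lambda obs: "forward",   # dummy classical planner
    high_level_policy=toy_high_level_policy,
)
trace = run_episode(nav, [{"map_confidence": 0.9}, {"map_confidence": 0.1}])
print(trace)  # [(1, 'forward'), (0, 'turn_left')]
```

Keeping the selection policy as a plain callable means that a learned policy could replace the placeholder rule without changing the two low-level planners, which mirrors the modular design the summary describes.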