Abstract: Autonomous navigation substantially depends on memory mechanisms for enhancing path optimality. However, the dynamic nature of environments can cause inconsistencies between the stored scene memory and real-time perception, leading to potentially catastrophic navigation errors. Existing memory structures often fall short in addressing these challenges, as they typically only account for object-level dynamics and falter when faced with long-term alterations in environmental structure. To counter these constraints, this paper presents a learning-based framework designed to construct and maintain a geometric memory, thereby facilitating improved visuomotor navigation under structural dynamics. This proposed framework employs visual inputs to construct a geometric representation of the environment, and identifies structural changes by assessing the consistency between the established memory and instantaneous perception. To update the geometric memory efficiently, we introduce a memory updater grounded in a structure storage pool. Furthermore, a two-phase hierarchical planner is proposed to decompose navigation tasks and formulate smooth navigation strategies. Experimental results from photorealistic simulations underscore the efficacy of the proposed system in managing long-term dynamics and navigation control. The effectiveness of the proposed system is further corroborated through deployment and testing in real-world environments.

Note to Practitioners: Classic geometry-based navigation methods hinge on the construction of a global map for spatial reasoning and optimized robot control. Within a learning-based pipeline, the dependence on memory information becomes critical for enabling robots to develop spatial and temporal awareness. While the maintenance of memory structures can significantly expand the field of the robot’s perception in time and space, discrepancies between historical memory and real-time perception in dynamic environments can lead to misguided decisions. Unlike most existing research that addresses short-term dynamic issues at the object level, this paper centers on long-term, large-scale dynamics precipitated by changes in environmental structure. We put forward a learning-based framework that incorporates dynamic perception, map maintenance, and hierarchical navigation. The experimental results highlight the efficiency and real-time processing capability of this method in handling structural dynamics, hence enhancing navigation efficiency.
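The idea of checking a stored memory against instantaneous perception can be illustrated with a toy example. The sketch below is not the paper's learned framework; it assumes the memory is a 2D binary occupancy grid and that the current observation is already aligned to it. The function names `structural_change_ratio` and `refresh_memory`, the `observed_mask` argument, and the 0.2 threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def structural_change_ratio(memory, observation, observed_mask):
    """Fraction of currently observed cells whose occupancy disagrees with memory.

    All three arrays share one shape. `observed_mask` marks the cells that the
    current observation actually covers (an assumption of this sketch).
    """
    observed = observed_mask.astype(bool)
    n_observed = int(observed.sum())
    if n_observed == 0:
        return 0.0  # nothing observed, so no evidence of change
    disagree = (memory.astype(bool) != observation.astype(bool)) & observed
    return float(disagree.sum()) / n_observed


def refresh_memory(memory, observation, observed_mask, threshold=0.2):
    """Overwrite observed cells when the disagreement exceeds `threshold`.

    Returns a new memory array and a flag saying whether a structural change
    was declared. The input memory is left unmodified.
    """
    observed = observed_mask.astype(bool)
    ratio = structural_change_ratio(memory, observation, observed)
    changed = ratio > threshold
    updated = memory.copy()
    if changed:
        updated[observed] = observation[observed]
    return updated, changed


# Toy scene: the memory holds a wall along column 2 of a 4x4 grid.
memory = np.zeros((4, 4), dtype=np.uint8)
memory[:, 2] = 1

# The robot currently sees only the top two rows, where the wall is now gone.
observation = np.zeros((4, 4), dtype=np.uint8)
observed_mask = np.zeros((4, 4), dtype=bool)
observed_mask[:2, :] = True

updated, changed = refresh_memory(memory, observation, observed_mask)
```

In this toy scene, 2 of the 8 observed cells disagree with the memory, so the ratio is 0.25 and the observed region is refreshed while the unobserved part of the wall is kept. A learned system would replace the hand-set threshold and the cell-wise comparison with its own consistency measure.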

Graph Attention Memory for Visual Navigation

DGMem: Learning Visual Navigation Policy Without Any Labels by Dynamic Graph Memory

A Novel Neural Multi-Store Memory Network for Autonomous Visual Navigation in Unknown Environment

Spiking Reinforcement Learning with Memory Ability for Mapless Navigation

MemoNav: Working Memory Model for Visual Navigation

Learning multimodal adaptive relation graph and action boost memory for visual navigation

MemoNav: Selecting Informative Memories for Visual Navigation

Frontier-enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation

Cognitive Navigation for Intelligent Mobile Robots: A Learning-Based Approach with Topological Memory Configuration

A Navigation Cognitive System Driven by Hierarchical Spiking Neural Network

Online Geometric Memory Generation and Maintenance for Visuomotor Navigation in Structural Dynamic Environments

Hierarchical Representations and Explicit Memory: Learning Effective Navigation Policies on 3D Scene Graphs using Graph Neural Networks

Learning Crowd Behaviors in Navigation with Attention-based Spatial-Temporal Graphs

GridMM: Grid Memory Map for Vision-and-Language Navigation

Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships

Building Category Graphs Representation with Spatial and Temporal Attention for Visual Navigation

Sparse Graphical Memory for Robust Planning

Visual Navigation with Multiple Goals Based on Deep Reinforcement Learning

Memory Proxy Maps for Visual Navigation

Topological Semantic Graph Memory for Image-Goal Navigation

MGRL: Graph neural network based inference in a Markov network with reinforcement learning for visual navigation