Abstract: Extensive studies have shown that many animals’ ability to form spatial representations for self-localization, path planning, and navigation relies on the functionality of place and head-direction (HD) cells in the hippocampus. Although there are numerous hippocampal modeling approaches, only a few span the full range of functions from processing raw sensory signals to planning and action generation. This paper presents a vision-based navigation system that generates place and HD cells by learning from visual images, builds topological maps from the learned cell representations, and performs navigation using hierarchical reinforcement learning. First, place and HD cells are trained from sequences of visual stimuli in an unsupervised fashion. A modified Slow Feature Analysis (SFA) algorithm is proposed to learn the different cell types deliberately by restricting their learning to separate phases of the spatial exploration. Then, to extract the metric information encoded in these unsupervised representations, a self-organized learning algorithm is applied to the emergent cell activities to generate topological maps that reveal the topology of the environment and the robot’s head direction, respectively. This enables the robot to perform self-localization and orientation detection based on the generated maps. Finally, goal-directed navigation is performed using reinforcement learning in continuous state spaces, which are represented by the population activities of place cells. In particular, because the topological map provides a natural hierarchical representation of the environment, hierarchical reinforcement learning (HRL) is used to exploit this hierarchy and accelerate learning. The HRL works on different spatial scales: a high-level policy learns to select subgoals, and a low-level policy learns over primitive actions to reach the selected subgoals. Experimental results demonstrate that our system is able to navigate a robot to the desired position effectively, and that HRL learns much faster than standard RL on our navigation tasks.
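To make the SFA step mentioned in the abstract more concrete, the following is a minimal sketch of standard linear SFA. It is not the modified algorithm proposed in the paper, whose details are not given here. The function name `linear_sfa`, its parameters, and the use of NumPy are assumptions made purely for illustration. The idea shown is the usual one: whiten the input signal, then pick the directions along which the whitened signal changes most slowly over time.

```python
import numpy as np


def linear_sfa(x, n_features):
    """Standard linear SFA (illustrative sketch, not the paper's modified variant).

    x: array of shape (T, d) with d >= 2, one multivariate sample per time step.
    Returns the input mean, a (d, n_features) projection matrix, and the
    slowness value (mean squared temporal difference) of each feature.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    xc = x - mean

    # Whitening: rotate and scale so the signal has identity covariance.
    cov = np.cov(xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10  # drop (near-)degenerate directions
    whitener = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = xc @ whitener

    # Second moment of the discrete time derivative of the whitened signal.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / len(dz)

    # eigh returns eigenvalues in ascending order, so the first columns
    # are the directions of slowest variation.
    dval, dvec = np.linalg.eigh(dcov)
    w = whitener @ dvec[:, :n_features]
    return mean, w, dval[:n_features]


if __name__ == "__main__":
    # Toy input: a linear mixture of one slow and one fast sinusoid.
    t = np.linspace(0.0, 2.0 * np.pi, 2000)
    slow, fast = np.sin(t), np.sin(40.0 * t)
    x = np.column_stack([slow + 0.5 * fast, 0.3 * slow - fast])

    mean, w, slowness = linear_sfa(x, n_features=1)
    y = (x - mean) @ w  # extracted slow feature, unit variance
    print("abs. correlation with slow source:",
          abs(np.corrcoef(y[:, 0], slow)[0, 1]))
```

In a system like the one described, the input `x` would be high-dimensional image features rather than two mixed sinusoids, and a nonlinear or hierarchical expansion would typically precede the linear step; both points are assumptions about typical SFA usage rather than statements about the paper's implementation.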

HSPNav: Hierarchical Scene Prior Learning for Visual Semantic Navigation Towards Real Settings

ChatNav: Leveraging LLM to Zero-shot Semantic Reasoning in Object Navigation

Visual Semantic Navigation using Scene Priors

Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation

StereoNavNet: Learning to Navigate using Stereo Cameras with Auxiliary Occupancy Voxels

A Navigation Cognitive System Driven by Hierarchical Spiking Neural Network

Multi goals and multi scenes visual mapless navigation in indoor using meta-learning and scene priors

Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge

Context vector-based visual mapless navigation in indoor using hierarchical semantic information and meta-learning

Socially Aware Object Goal Navigation with Heterogeneous Scene Representation Learning

Think Holistically, Act Down-to-Earth: A Semantic Navigation Strategy with Continuous Environmental Representation and Multi-step Forward Planning

Hierarchical Representations and Explicit Memory: Learning Effective Navigation Policies on 3D Scene Graphs using Graph Neural Networks

Object Goal Visual Navigation Using Semantic Spatial Relationships

Learning Navigational Visual Representations with Semantic Map Supervision

Learning a Semantic Prior for Guided Navigation

Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding

Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation

Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation

Vision-Based Robot Navigation through Combining Unsupervised Learning and Hierarchical Reinforcement Learning

Neuro-Planner: A 3D Visual Navigation Method for MAV With Depth Camera Based on Neuromorphic Reinforcement Learning

SemNav-HRO: A Target-Driven Semantic Navigation Strategy with Human–robot–object Ternary Fusion