Abstract: Vision-language navigation (VLN) is a critical domain within embodied intelligence, requiring agents to navigate 3D environments based on natural language instructions. Traditional VLN research has focused on improving environmental understanding and decision accuracy. However, these approaches often exhibit a significant performance gap when agents are deployed in novel environments, mainly due to the limited diversity of training data. Expanding datasets to cover a broader range of environments is impractical and costly. To address this challenge, we propose the Vision-Language Navigation with Continual Learning (VLNCL) paradigm, in which agents incrementally learn new environments while retaining previously acquired knowledge. VLNCL enables agents to maintain an environmental memory and extract relevant knowledge from it, allowing rapid adaptation to new environments while preserving existing information. We introduce a novel dual-loop scenario replay method (Dual-SR) for VLN agents, inspired by memory replay mechanisms in the brain. This method consolidates past experiences and enhances generalization to new tasks. Using a multi-scenario memory buffer, the agent organizes and replays task memories efficiently, which strengthens its ability to adapt quickly to new environments and mitigates catastrophic forgetting. Our work pioneers continual learning in VLN agents and introduces a novel experimental setup and evaluation metrics. We demonstrate the effectiveness of our approach through extensive evaluations and establish a benchmark for the VLNCL paradigm. Comparative experiments with existing continual learning and VLN methods show significant improvements, with state-of-the-art continual learning performance, highlighting the potential of our approach for rapid adaptation while preserving prior knowledge.
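
The idea of a multi-scenario memory buffer can be illustrated with a minimal sketch. This is not the Dual-SR implementation, and the abstract does not specify its details. The class name `MultiScenarioBuffer`, the per-scenario capacity, and the round-robin sampling rule are all assumptions made for illustration. The sketch only shows one simple way to keep episodes grouped by environment so that replay batches continue to include earlier scenarios.

```python
import random
from collections import defaultdict, deque


class MultiScenarioBuffer:
    """Hypothetical per-scenario replay buffer (illustrative, not the paper's code)."""

    def __init__(self, capacity_per_scenario, seed=0):
        # Each scenario gets its own bounded queue; the oldest episodes are
        # evicted first once a scenario reaches its capacity.
        self.store = defaultdict(lambda: deque(maxlen=capacity_per_scenario))
        self.rng = random.Random(seed)

    def add(self, scenario_id, episode):
        self.store[scenario_id].append(episode)

    def sample(self, batch_size):
        # Assumed sampling rule: cycle through scenarios round-robin so that
        # previously seen environments stay represented in every batch.
        scenarios = [sid for sid, eps in self.store.items() if eps]
        if not scenarios:
            return []
        batch = []
        for i in range(batch_size):
            sid = scenarios[i % len(scenarios)]
            batch.append((sid, self.rng.choice(list(self.store[sid]))))
        return batch


# Usage: two scenarios, three episodes each, capacity of two per scenario.
buffer = MultiScenarioBuffer(capacity_per_scenario=2)
for sid in ("scene_a", "scene_b"):
    for step in range(3):
        buffer.add(sid, f"{sid}-episode-{step}")
batch = buffer.sample(4)
```

In this sketch, a new environment never crowds out the stored episodes of an earlier one, because each scenario has its own bounded queue. That is one straightforward way to limit forgetting during replay, and the actual method may organize memory differently.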

Vision-Language Navigation with Continual Learning