Abstract: Model-based reinforcement learning (RL) is regarded as a promising approach to the challenges that hinder model-free RL. Its success hinges critically on the quality of the learned dynamics model. However, for many real-world tasks involving high-dimensional state spaces, current dynamics prediction models perform poorly in long-term prediction. To address this, we propose a novel two-branch neural network architecture with multi-timescale memory augmentation that handles long-term and short-term memory differently. Specifically, we follow previous works in using a recurrent neural network to encode the history of observations into a latent space, which characterizes the long-term memory of the agent. Unlike previous works, we treat the most recent observations as the short-term memory of the agent and use them to reconstruct the next frame directly, which avoids compounding error. This is achieved by introducing a self-supervised optical flow prediction structure that models the action-conditional feature transformation at the pixel level. The reconstructed observation is finally augmented by the long-term memory to ensure semantic consistency. Experimental results show that our approach generates visually realistic long-term predictions in DeepMind maze navigation games and outperforms prevalent state-of-the-art methods in prediction accuracy by a large margin. We also evaluate the usefulness of our world model by using the predicted frames to drive an imagination-augmented exploration strategy that improves the model-free RL controller.
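
The short-term branch described in the abstract reconstructs the next frame from recent observations using a predicted optical flow field. The sketch below illustrates only the generic warping step that such a design relies on: each output pixel is sampled from the previous frame at a location displaced by the flow. It is a minimal pure-Python illustration under stated assumptions, not the paper's implementation. The function name `warp_frame`, the grayscale list-of-lists frame format, the `(dx, dy)` flow convention, bilinear sampling, and border clamping are all hypothetical choices made for this example; in the paper the flow is predicted by a network conditioned on the action, whereas here it is supplied by hand.

```python
from typing import List, Tuple

Frame = List[List[float]]               # H x W grayscale image (assumed format)
Flow = List[List[Tuple[float, float]]]  # H x W grid of (dx, dy) displacements


def warp_frame(frame: Frame, flow: Flow) -> Frame:
    """Backward-warp a frame: out[y][x] samples frame at (x + dx, y + dy).

    Sampling is bilinear, and sample coordinates are clamped to the image
    border. Both are illustrative assumptions, not taken from the paper.
    """
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Source location in the previous frame, clamped to valid range.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            # Integer corners of the cell that contains the source location.
            x0, y0 = int(sx), int(sy)
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = (1 - ax) * frame[y0][x0] + ax * frame[y0][x1]
            bottom = (1 - ax) * frame[y1][x0] + ax * frame[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


# A 3x3 frame with one bright pixel in the centre.
previous = [[0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0]]

# Uniform flow (-1, 0): every output pixel looks one pixel to its left,
# so the image content appears shifted one pixel to the right.
shift_right = [[(-1.0, 0.0)] * 3 for _ in range(3)]
predicted = warp_frame(previous, shift_right)
```

Because the predicted frame is assembled from pixels of a real recent observation rather than decoded entirely from a latent state, this kind of warping is one plausible reading of how the short-term branch limits the accumulation of error. The long-term memory augmentation described in the abstract is not modelled in this sketch.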

Learning a World Model With Multitimescale Memory Augmentation

Model-Based Reinforcement Learning Via Imagination with Derived Memory.

A Novel Neural Multi-Store Memory Network for Autonomous Visual Navigation in Unknown Environment

Incremental Model Enhancement Via Memory-based Contrastive Learning

Self-Supervised Exploration via Temporal Inconsistency in Reinforcement Learning

Multimodal Reward Shaping for Efficient Exploration in Reinforcement Learning

Mnemonic Dictionary Learning for Intrinsic Motivation in Reinforcement Learning

MemoNav: Working Memory Model for Visual Navigation

Visionary: vision-aware enhancement with reminding scenes generated by captions via multimodal transformer for embodied referring expression

Learning Efficient Representation for Intrinsic Motivation

Tackling Visual Control via Multi-View Exploration Maximization

Deep Visual Odometry with Adaptive Memory

Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation

Scalable Spatial Memory for Scene Rendering and Navigation

AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning

Sparse Graphical Memory for Robust Planning

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Go Beyond Imagination: Maximizing Episodic Reachability with World Models

Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models