Abstract: To realize trajectory prediction, most previous methods adopt a parameter-based approach, which encodes all the seen past-future instance pairs into model parameters. However, because the model parameters are derived from all seen instances, a large number of irrelevant instances may also be involved in predicting the current situation, degrading performance. To provide a more explicit link between the current situation and the seen instances, we imitate the mechanism of retrospective memory in neuropsychology and propose MemoNet, an instance-based approach that predicts the movement intentions of agents by looking for similar scenarios in the training data. In MemoNet, we design a pair of memory banks to explicitly store representative instances in the training set, acting as the prefrontal cortex in the neural system, and a trainable memory addresser to adaptively search the memory bank for instances similar to the current situation, acting like the basal ganglia. During prediction, MemoNet recalls previous memory by using the memory addresser to index related instances in the memory bank. We further propose a two-step trajectory prediction system, where the first step leverages MemoNet to predict the destination and the second step completes the whole trajectory according to the predicted destinations. Experiments show that the proposed MemoNet improves the FDE by 20.3%/10.2%/28.3% over the previous best method on the SDD/ETH-UCY/NBA datasets. Experiments also show that MemoNet can trace back to specific instances during prediction, improving interpretability.
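To make the instance-based idea concrete, the following is a minimal sketch of recalling stored instances from a pair of memory banks, not the MemoNet implementation. It assumes past trajectories have already been encoded as small feature vectors, and it swaps the trainable memory addresser described above for a fixed cosine-similarity ranking. The names `MemoryBank`, `write`, `recall`, and `final_displacement_error` are hypothetical, and the stored values are toy data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryBank:
    """Hypothetical paired banks: past features (keys) and destinations (values)."""

    def __init__(self):
        self.past = []       # one feature vector per stored instance
        self.intention = []  # the destination observed for that instance

    def write(self, past_feature, destination):
        """Store one past-future instance pair."""
        self.past.append(list(past_feature))
        self.intention.append(tuple(destination))

    def recall(self, query, k=2):
        """Return (index, destination) for the k stored instances most similar
        to the query. Assumption: fixed cosine ranking stands in for the
        trainable memory addresser."""
        ranked = sorted(
            range(len(self.past)),
            key=lambda i: cosine(query, self.past[i]),
            reverse=True,
        )
        return [(i, self.intention[i]) for i in ranked[:k]]


def final_displacement_error(predicted, actual):
    """FDE: Euclidean distance between predicted and true final positions."""
    return math.dist(predicted, actual)


# Toy data: three stored instances and one query.
bank = MemoryBank()
bank.write([1.0, 0.0], (5.0, 0.0))
bank.write([0.0, 1.0], (0.0, 5.0))
bank.write([1.0, 1.0], (4.0, 4.0))

recalled = bank.recall([0.9, 0.1], k=2)
fde = final_displacement_error(recalled[0][1], (5.0, 1.0))
```

Because `recall` returns the indices of the stored instances, each predicted destination can be traced back to the training examples that produced it, which is the interpretability property the abstract refers to.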

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks

SnapMem: Snapshot-based 3D Scene Memory for Embodied Exploration and Reasoning

Spatially-Aware Transformer for Embodied Agents

Transformer Memory for Interactive Visual Navigation in Cluttered Environments

Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing

Recurrent Action Transformer with Memory

Memory-and-Anticipation Transformer for Online Action Understanding

HiMemFormer: Hierarchical Memory-Aware Transformer for Multi-Agent Action Anticipation

Scalable Spatial Memory for Scene Rendering and Navigation

Structured Scene Memory for Vision-Language Navigation

VME-Transformer: Enhancing Visual Memory Encoding for Navigation in Interactive Environments

3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning

A Global-Memory-Aware Transformer for Vision-and-Language Navigation

Think Before You Act: Decision Transformers with Working Memory

Frontier-enhanced Topological Memory with Improved Exploration Awareness for Embodied Visual Navigation

Long horizon episodic decision making for cognitively inspired robots

Learning a World Model With Multitimescale Memory Augmentation

Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents

Remember Intentions: Retrospective-Memory-based Trajectory Prediction

Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation

Optimizing Agent Behavior over Long Time Scales by Transporting Value