Remember the Past for Better Future: Memory-Augmented Offline RL

Yue Zhang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li
DOI: https://doi.org/10.1109/ijcnn60899.2024.10651193
2024-01-01
Abstract: As a foundation of human intelligence, memory has been found to be critical for human attention and decision making. However, it is usually underutilized in the current reinforcement learning literature, where it serves primarily as training data. Other uses of memory have rarely been explored. To explore the potential of memory architectures, we focus on the offline reinforcement learning setting, where a fixed memory buffer is provided, and propose a novel framework to exploit it. Specifically, an attention-based architecture is designed to adaptively utilize past memories in learned environment dynamics models, providing reliable references for the estimation of future states. Such memory-augmented environment dynamics models are then applied to boost the training of RL policies. While demonstrating superior empirical performance, our method is highly extendable to most offline model-based RL algorithms without any change in their pipelines or theoretical conclusions.
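The abstract gives no architectural details, so the following is only a minimal sketch of the general idea of attending over a fixed offline buffer to obtain a reference for the next state. It is not the authors' architecture. The function names (`softmax`, `attend_to_memory`), the choice of scaled dot-product scores on raw state-action features, and the `temperature` parameter are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def attend_to_memory(state, action, mem_states, mem_actions, mem_next_states,
                     temperature=1.0):
    """Hypothetical helper: attention over stored transitions.

    The query is the current (state, action) pair; keys are the stored
    (state, action) pairs; values are the stored next states.
    Returns the attention weights and the weighted next-state reference.
    """
    query = np.concatenate([state, action])                   # shape (d,)
    keys = np.concatenate([mem_states, mem_actions], axis=1)  # shape (N, d)
    # Scaled dot-product scores (an assumed similarity measure).
    scores = keys @ query / (np.sqrt(query.shape[0]) * temperature)
    weights = softmax(scores)                                 # shape (N,)
    reference = weights @ mem_next_states                     # shape (state_dim,)
    return weights, reference


# Toy buffer of three transitions with 2-D states and 1-D actions.
mem_states = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
mem_actions = np.zeros((3, 1))
mem_next_states = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]])

weights, reference = attend_to_memory(
    np.array([1.0, 0.0]), np.array([0.0]),
    mem_states, mem_actions, mem_next_states,
)
```

In a pipeline like the one the abstract describes, such a reference could be supplied to the learned dynamics model as an additional input. How the paper actually combines the two is not stated in the abstract.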