Abstract: Reinforcement Learning (RL), especially Deep Reinforcement Learning (DRL), has made great progress in many areas, such as robotics, video games and driving. However, sample inefficiency remains a major obstacle to the widespread practical application of DRL. Inspired by decision making in the human brain, this problem can be addressed by incorporating instance-based learning, i.e. episodic memory. Many episodic-memory-based RL algorithms have emerged recently. However, these algorithms either simply replace the parametric DRL algorithm with episodic control or incorporate episodic memory into only a single component of DRL. In contrast to previous works, this paper proposes a new sample-efficient reinforcement learning architecture that introduces a new episodic memory module and incorporates episodic memory into several key components of DRL: exploration, experience replay and the loss function. Taking the Deep Q-Network (DQN) algorithm as an example, the combination of our architecture with DQN is called High Efficient Episodic Memory DQN (HE-EMDQN). In HE-EMDQN, a new non-parametric episodic memory module is introduced to help calculate the loss and to modify the predicted value used for exploration. To accelerate learning from samples in experience replay, an auxiliary small buffer called the percentile best episode replay memory is designed, and samples from it are combined with samples from the regular replay memory to compose a mixed mini-batch. We show across the testing environments that our algorithm is significantly more powerful and sample-efficient than DQN and the recent Episodic Memory Deep Q-Networks (EMDQN). This work provides a new perspective for other RL algorithms to improve sample efficiency by utilising episodic memory efficiently.
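
To make the idea of a mixed mini-batch concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify how the percentile best episode replay memory is filled or how the batch is split, so everything here is an assumption for illustration: the class name `MixedReplay`, the nearest-rank percentile threshold over undiscounted episode returns, the `best_fraction` split, and all default sizes.

```python
import random
from collections import deque


class MixedReplay:
    """Hypothetical sketch: a main replay buffer plus a small buffer that
    keeps transitions from episodes with high returns."""

    def __init__(self, main_capacity=10000, best_capacity=1000,
                 percentile=0.9, best_fraction=0.25, seed=0):
        self.main = deque(maxlen=main_capacity)   # ordinary replay memory
        self.best = deque(maxlen=best_capacity)   # small "best episode" memory
        self.returns = []                         # returns of all finished episodes
        self.percentile = percentile              # assumed admission percentile
        self.best_fraction = best_fraction        # assumed share of each batch
        self.rng = random.Random(seed)

    def _threshold(self):
        # Nearest-rank percentile over the episode returns seen so far.
        ordered = sorted(self.returns)
        index = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[index]

    def add_episode(self, transitions):
        """transitions: list of (state, action, reward, next_state, done)."""
        episode_return = sum(t[2] for t in transitions)
        self.returns.append(episode_return)
        self.main.extend(transitions)
        # Assumption: an episode enters the small buffer only if its return
        # reaches the current percentile threshold.
        if episode_return >= self._threshold():
            self.best.extend(transitions)

    def sample(self, batch_size):
        # Draw part of the batch from the small buffer, the rest from the main one.
        n_best = min(int(batch_size * self.best_fraction), len(self.best))
        n_main = min(batch_size - n_best, len(self.main))
        batch = self.rng.sample(list(self.best), n_best)
        batch += self.rng.sample(list(self.main), n_main)
        return batch
```

In this sketch, a call such as `buffer.sample(32)` would return up to 8 transitions from high-return episodes followed by transitions from the main buffer. The fixed split is only one possible design, and the abstract does not state which ratio the authors use.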

Sequential memory improves sample and memory efficiency in Episodic Control

Episodic Reinforcement Learning with Associative Memory

Towards sample-efficient episodic control with DAC-ML

Integrating Episodic Memory into a Reinforcement Learning Agent using Reservoir Sampling

Sample Efficient Reinforcement Learning Method Via High Efficient Episodic Memory

Deep Reinforcement Learning with Parametric Episodic Memory

Dual Memory Model for Experience-Once Task-Incremental Lifelong Learning

Continuous Episodic Control

State-based episodic memory for multi-agent reinforcement learning

Neural Episodic Control with State Abstraction

A hippocampus CA3 spiking neural network model for storage and retrieval of sequential memory

Episodic memory governs choices: An RNN-based reinforcement learning model for decision-making task

Episodic Memory Deep Q-Networks

Sequential Memory: a Putative Neural and Synaptic Dynamical Mechanism

Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means

Episodic Memory for Learning Subjective-Timescale Models

Efficient Replay Memory Architectures in Multi-Agent Reinforcement Learning for Traffic Congestion Control

Sample Efficient Reinforcement Learning Using Graph-Based Memory Reconstruction

Episodic and associative memory from spatial scaffolds in the hippocampus

ELiSe: Efficient Learning of Sequences in Structured Recurrent Networks

Episodic Reinforcement Learning with Expanded State-reward Space