Deep Reinforcement Learning with Parametric Episodic Memory

Kangkang Chen, Zhongxue Gan, Siyang Leng, Chun Guan
DOI: https://doi.org/10.1109/ijcnn55064.2022.9891902
2022-01-01
Abstract: Deep reinforcement learning methods are widely acknowledged to be sample inefficient. Incorporating episodic memory significantly improves sample efficiency by rapidly latching onto successful experiences to guide the agent's actions. Previous episodic methods, which rely on discrete memory, do not accommodate continuous control tasks well and have limited ability to generalize by aggregating experience across trajectories. We propose an improved episodic memory-based RL algorithm that combines the one-step method of off-policy algorithms with Parametric Episodic Memory (PEM), which leverages the discrete memory through neural networks and thereby enhances both sample efficiency and generalization ability. Moreover, an adaptive k-nearest-neighbors scheme determines the volume of retrieved memory, further improving efficiency. Our algorithm, evaluated on various MuJoCo continuous control tasks, outperforms model-free baseline methods and the latest episodic memory-based RL algorithms.