Abstract: Deep reinforcement learning has achieved significant success in various domains. However, it still faces a major challenge when learning multiple tasks in sequence, because interaction in a complex setting involves continual learning in which the data distribution changes over time. A continual learning system should ensure that the agent acquires new knowledge without forgetting previous knowledge. However, catastrophic forgetting may occur because new experience can overwrite previous experience when memory size is limited. The dual experience replay algorithm, which retains previous experience, is widely applied to reduce forgetting, but it does not scale to many tasks when memory size is constrained. To alleviate the constraint imposed by memory size, we propose a new continual reinforcement learning algorithm called Self-generated Long-term Experience Replay (SLER). Our method differs from the standard dual experience replay algorithm, which uses short-term experience replay to retain current-task experience and long-term experience replay to retain the experience of all past tasks in order to achieve continual learning. In this paper, we first train an environment sample model called Experience Replay Mode (ERM) to generate simulated state sequences of the previous tasks for knowledge retention. We then combine the ERM with the experience of the new task to generate simulated experience of all previous tasks and so alleviate forgetting. Our method effectively decreases the memory requirement in multi-task reinforcement learning. We show that, in the StarCraft II and GridWorld environments, our method performs better than the state-of-the-art deep learning method and achieves results comparable to the dual experience replay method, which retains the experience of all tasks.
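The replay scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ShortTermBuffer`, `mixed_batch`, `generate_past` and `past_fraction` are hypothetical, and the ERM is represented only by a caller-supplied function that returns one simulated past-task transition. The sketch shows a bounded short-term buffer for the current task and a training batch that mixes real new-task transitions with generated past-task transitions, so no long-term buffer of stored past experience is needed.

```python
import random
from collections import deque


class ShortTermBuffer:
    """Bounded FIFO buffer holding only current-task transitions (hypothetical name)."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest transition once capacity is reached
        self.data = deque(maxlen=capacity)

    def add(self, transition):
        self.data.append(transition)

    def sample(self, n, rng):
        # Sample without replacement, never asking for more than is stored
        return rng.sample(list(self.data), min(n, len(self.data)))


def mixed_batch(short_term, generate_past, batch_size, past_fraction, rng):
    """Build one training batch from real and generated experience.

    generate_past is an assumed stand-in for a trained generative model
    (the ERM in the abstract); here it is any callable taking an RNG and
    returning one simulated transition of a previous task.
    """
    n_past = int(batch_size * past_fraction)
    n_new = batch_size - n_past
    new = short_term.sample(n_new, rng)
    past = [generate_past(rng) for _ in range(n_past)]
    return new + past
```

In this sketch the memory cost is the short-term buffer plus the generator's parameters, whereas a stored long-term buffer would grow with the number of past tasks. The mixing ratio `past_fraction` is an illustrative knob, not a parameter taken from the paper.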

TD3 with Composite Forgetting Prioritized Experience Replay

Actor Prioritized Experience Replay

Prioritized Experience Replay in Multi-Actor-Attention-Critic for Reinforcement Learning

Prioritized experience replay based on dynamics priority

Replay-enhanced Continual Reinforcement Learning

Fresher Experience Plays a More Important Role in Prioritized Experience Replay

Leveraging Efficiency Through Hybrid Prioritized Experience Replay in Door Environment.

A priority experience replay actor-critic algorithm using self-attention mechanism for strategy optimization of discrete problems

Prioritized experience replay in path planning via multi-dimensional transition priority fusion

Advances in Experience Replay

Dual Memory Model for Experience-Once Task-Incremental Lifelong Learning.

Efficient Diversity-based Experience Replay for Deep Reinforcement Learning

Ddper - Decentralized Distributed Prioritized Experience Replay.

Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay

Cooperative multi-agent target searching: a deep reinforcement learning approach based on parallel hindsight experience replay

ROER: Regularized Optimal Experience Replay

Prioritized Experience Replay

Optimizing TD3 for 7-DOF Robotic Arm Grasping: Overcoming Suboptimality with Exploration-Enhanced Contrastive Learning

Consistent Experience Replay in High-Dimensional Continuous Control with Decayed Hindsights

SLER: Self-generated long-term experience replay for continual reinforcement learning

AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization