Abstract: Deep reinforcement learning has achieved significant success in various domains. However, it still faces a major challenge when learning multiple tasks in sequence, because interaction in a complex setting involves continual learning, in which the data distribution changes over time. A continual learning system should ensure that the agent acquires new knowledge without forgetting previous knowledge. However, catastrophic forgetting may occur because, with limited memory, new experience can overwrite previous experience. The dual experience replay algorithm, which retains previous experience, is widely applied to reduce forgetting, but it does not scale to many tasks when the memory size is constrained. To alleviate the memory-size constraint, we propose a new continual reinforcement learning algorithm called Self-generated Long-term Experience Replay (SLER). Our method differs from the standard dual experience replay algorithm, which uses short-term experience replay to retain the current task's experience and long-term experience replay to retain the experience of all past tasks. In this paper, we first train an environment sample model called Experience Replay Mode (ERM) to generate simulated state sequences of the previous tasks for knowledge retention. We then combine the ERM with the experience of the new task to generate simulated experience for all previous tasks, which alleviates forgetting. Our method effectively reduces the memory required for multi-task reinforcement learning. We show that, in the StarCraft II and GridWorld environments, our method performs better than the state-of-the-art deep learning method and achieves results comparable to the dual experience replay method, which retains the experience of all tasks.
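The contrast drawn in the abstract, between a long-term memory that stores every past transition and one that regenerates past experience from a compact model, can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `DualReplay`, `StoredLongTerm`, and `GeneratedLongTerm` are hypothetical, transitions are reduced to `(task_id, state)` pairs with a scalar state, and a per-task Gaussian summary stands in for a learned sample model such as the ERM.

```python
import random
from collections import deque


class DualReplay:
    """Short-term FIFO buffer plus a pluggable long-term source (hypothetical)."""

    def __init__(self, short_capacity, long_term_sampler, long_fraction=0.5):
        # Bounded buffer: once full, new experience overwrites the oldest.
        self.short = deque(maxlen=short_capacity)
        # Callable n -> list of transitions drawn from past tasks.
        self.long_term_sampler = long_term_sampler
        self.long_fraction = long_fraction

    def add(self, transition):
        self.short.append(transition)

    def sample(self, batch_size):
        # Mix current-task transitions with long-term (past-task) ones.
        n_long = int(batch_size * self.long_fraction)
        n_short = min(batch_size - n_long, len(self.short))
        batch = random.sample(list(self.short), n_short)
        batch.extend(self.long_term_sampler(n_long))
        return batch


class StoredLongTerm:
    """Standard variant: keep every past transition, so memory grows with tasks."""

    def __init__(self):
        self.items = []

    def extend(self, transitions):
        self.items.extend(transitions)

    def __call__(self, n):
        return random.choices(self.items, k=n) if self.items else []


class GeneratedLongTerm:
    """Generative variant: keep a small per-task summary and synthesize states.

    Assumption: a (mean, std) pair per task is a toy stand-in for a learned model.
    """

    def __init__(self):
        self.stats = {}  # task_id -> (mean, std) of the scalar state

    def fit(self, task_id, states):
        mean = sum(states) / len(states)
        var = sum((s - mean) ** 2 for s in states) / len(states)
        self.stats[task_id] = (mean, var ** 0.5)

    def __call__(self, n):
        if not self.stats:
            return []
        task_ids = list(self.stats)
        samples = []
        for _ in range(n):
            task_id = random.choice(task_ids)
            mean, std = self.stats[task_id]
            samples.append((task_id, random.gauss(mean, std)))
        return samples
```

In this sketch the stored variant holds one entry per past transition, while the generative variant holds one summary per past task, which is the memory trade-off the abstract describes. Either object can be passed as `long_term_sampler`, for example `DualReplay(short_capacity=4, long_term_sampler=GeneratedLongTerm())`.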

SEEKR: Selective Attention-Guided Knowledge Retention for Continual Learning of Large Language Models

Replay-enhanced Continual Reinforcement Learning

Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning

Watch Your Step: Optimal Retrieval for Continual Learning at Scale

Adaptive Memory Replay for Continual Learning

Effective Data Selection and Replay for Unsupervised Continual Learning

Selecting Related Knowledge via Efficient Channel Attention for Online Continual Learning

Continual Learning with Strong Experience Replay

CORE: Mitigating Catastrophic Forgetting in Continual Learning through Cognitive Replay

Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal

SLER: Self-generated long-term experience replay for continual reinforcement learning

Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA

Continual Learning via Manifold Expansion Replay

Saliency-Guided Hidden Associative Replay for Continual Learning

Towards Continual Knowledge Learning of Language Models

Coordinating Experience Replay: A Harmonious Experience Retention approach for Continual Learning

DESIRE: Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning

Improving Replay Sample Selection and Storage for Less Forgetting in Continual Learning

Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship

Principal Gradient Direction and Confidence Reservoir Sampling for Continual Learning

May the Forgetting Be with You: Alternate Replay for Learning with Noisy Labels