Abstract: Prioritized experience replay (PER) is an important technique in deep reinforcement learning (DRL). It improves the sample efficiency of various DRL algorithms and achieves strong performance. PER uses the temporal difference error (TD-error) to measure the value of experiences and adjusts their sampling probability accordingly. Although PER can sample valuable experiences according to the TD-error, freshness is also an important characteristic of experiences, because it implicitly reflects their potential value. Fresh experiences are produced by the current networks, so they are more valuable for updating the current network parameters than past experiences. Sampling fresh experiences to train the neural networks can increase the learning speed of the agent, but few algorithms do this efficiently. To address this issue, a novel experience replay method is proposed in this paper. We first define experience freshness as negatively correlated with the number of times an experience has been replayed. A new hyper-parameter, the freshness discounted factor μ, is introduced in PER to measure experience freshness. Further, a novel experience replacement strategy for the replay buffer is proposed to increase experience replacement efficiency. In our method, the sampling probability of fresh experiences is increased by appropriately raising their priority, so the algorithm is more likely to choose fresh experiences to train the neural networks during learning. We evaluated this method on both discrete control tasks and continuous control tasks in OpenAI Gym. The experimental results show that our method achieves better performance in both settings.
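
As an illustration of the idea described in the abstract, the following is a minimal sketch of how a freshness discount might be combined with a TD-error priority. The abstract does not give the exact formula, so the form `mu ** replay_count * |TD-error|`, the function names, the exponent `alpha`, and the small constant `eps` are all assumptions made for this example and not necessarily the paper's definitions.

```python
import random


def freshness_priorities(td_errors, replay_counts, mu=0.9, eps=1e-6):
    """Assumed priority: |TD-error| discounted by mu ** (number of replays).

    With 0 < mu < 1, an experience that has been replayed more often
    (i.e. is less fresh) receives a lower priority.
    """
    return [(mu ** c) * abs(d) + eps for d, c in zip(td_errors, replay_counts)]


def sampling_probabilities(priorities, alpha=0.6):
    """Standard PER-style normalization: P(i) = p_i^alpha / sum_k p_k^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_batch(td_errors, replay_counts, batch_size, mu=0.9, alpha=0.6, rng=None):
    """Sample indices by priority and increment their replay counts in place."""
    rng = rng or random.Random()
    priorities = freshness_priorities(td_errors, replay_counts, mu)
    probs = sampling_probabilities(priorities, alpha)
    indices = rng.choices(range(len(probs)), weights=probs, k=batch_size)
    for i in indices:
        replay_counts[i] += 1  # each replay makes the experience less fresh
    return indices, probs
```

In this sketch, two experiences with the same TD-error but different replay counts get different sampling probabilities, with the less-replayed one favored. A real buffer would typically use a sum-tree rather than lists for efficient sampling.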

Investigating the Interplay of Prioritized Replay and Generalization

Prioritized experience replay based on dynamics priority

High-Value Prioritized Experience Replay For Off-Policy Reinforcement Learning

Prioritized Generative Replay

Prioritized Experience Replay

Attention Loss Adjusted Prioritized Experience Replay

Fresher Experience Plays a More Important Role in Prioritized Experience Replay

Actor Prioritized Experience Replay

Leveraging Efficiency Through Hybrid Prioritized Experience Replay in Door Environment

Directly Attention Loss Adjusted Prioritized Experience Replay

ROER: Regularized Optimal Experience Replay

Advances in Experience Replay

Prioritized Experience Replay in Multi-Actor-Attention-Critic for Reinforcement Learning

Ddper - Decentralized Distributed Prioritized Experience Replay

Balanced Prioritized Experience Replay in Off-Policy Reinforcement Learning

Understanding the effect of varying amounts of replay per step

Replay-enhanced Continual Reinforcement Learning

Revisiting Prioritized Experience Replay: A Value Perspective

Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards

Prioritized Experience Replay for Multi-agent Cooperation

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning