Abstract: Prioritized experience replay (PER) is an important technique in deep reinforcement learning (DRL). It improves sample efficiency in various DRL algorithms and achieves strong performance. PER uses the temporal difference error (TD-error) to measure the value of experiences and adjusts their sampling probability accordingly. Although PER can sample valuable experiences according to the TD-error, freshness is also an important characteristic of experiences, because it implicitly reflects their potential value. Fresh experiences are produced by the current networks, so they are more valuable for updating the current network parameters than older ones. Sampling fresh experiences to train the neural networks can increase the learning speed of the agent, but few algorithms do this efficiently. To address this issue, a novel experience replay method is proposed in this paper. We first define experience freshness as negatively correlated with the number of times an experience has been replayed. A new hyper-parameter, the freshness discounted factor μ, is introduced in PER to measure experience freshness. Further, a novel experience replacement strategy for the replay buffer is proposed to increase the efficiency of experience replacement. In our method, the sampling probability of fresh experiences is increased by appropriately raising their priority. As a result, the algorithm is more likely to choose fresh experiences to train the neural networks during learning. We evaluated this method on both discrete control tasks and continuous control tasks via OpenAI Gym. The experimental results show that our method achieves better performance in both settings.
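
To make the idea of a freshness-discounted priority concrete, here is a minimal sketch. The abstract does not state the exact formula, so the form used here (the absolute TD-error multiplied by μ raised to the replay count) is an assumption, as are the function names, the default values, and the small constant eps. The exponent alpha is the usual PER prioritization exponent.

```python
def freshness_priority(td_error, replay_count, mu=0.9, eps=1e-6):
    """Assumed priority: |TD-error| discounted by mu once per past replay.

    The exact formula is not given in the abstract; this form is illustrative.
    """
    return (mu ** replay_count) * abs(td_error) + eps


def sampling_probabilities(td_errors, replay_counts, mu=0.9, alpha=0.6):
    """Turn priorities into sampling probabilities, as in standard PER."""
    scaled = [
        freshness_priority(delta, count, mu) ** alpha
        for delta, count in zip(td_errors, replay_counts)
    ]
    total = sum(scaled)
    return [p / total for p in scaled]


# Three experiences: the first two have equal TD-error, but the second
# has already been replayed five times, so it counts as less fresh.
td_errors = [1.0, 1.0, 0.5]
replay_counts = [0, 5, 0]
probs = sampling_probabilities(td_errors, replay_counts)
```

Under these assumptions, an experience that has never been replayed is sampled more often than one with the same TD-error that has been replayed several times, which matches the behaviour described above.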

High-Value Prioritized Experience Replay For Off-Policy Reinforcement Learning

Prioritized experience replay based on dynamics priority

Leveraging Efficiency Through Hybrid Prioritized Experience Replay in Door Environment

Directly Attention Loss Adjusted Prioritized Experience Replay

Attention Loss Adjusted Prioritized Experience Replay

Prioritized Experience Replay

Ddper - Decentralized Distributed Prioritized Experience Replay

Actor Prioritized Experience Replay

Investigating the Interplay of Prioritized Replay and Generalization

Prioritized Experience Replay in Multi-Actor-Attention-Critic for Reinforcement Learning

Fresher Experience Plays a More Important Role in Prioritized Experience Replay

Balanced Prioritized Experience Replay in Off-Policy Reinforcement Learning

ROER: Regularized Optimal Experience Replay

Advances in Experience Replay

Z-Score Experience Replay in Off-Policy Deep Reinforcement Learning

Revisiting Prioritized Experience Replay: A Value Perspective

Prioritised Experience Replay Based on Sample Optimisation

Efficient Diversity-based Experience Replay for Deep Reinforcement Learning

Associative Memory Based Experience Replay for Deep Reinforcement Learning

Prioritized Experience Replay for Multi-agent Cooperation

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning