An Approach to Optimize Replay Buffer in Value-Based Reinforcement Learning.

Baicheng Chen, Tianhan Gao, Qingwei Mi
DOI: https://doi.org/10.1109/sose59841.2023.10178657
2023-01-01
Abstract: Reinforcement Learning (RL) has advanced considerably in recent years, particularly in value-based algorithms. A key component of these algorithms is the replay buffer, which stores past experiences so the agent can reuse them during learning. In this paper, the authors explore a replay buffer optimization method that increases an agent's learning efficiency by prioritizing experiences according to their training value (T). The authors test the proposed approach in two environments, a maze and CartPole-v1, comparing it against traditional Q-learning and Deep Q-Network (DQN) algorithms. The results show improvements in learning efficiency and training outcomes, suggesting the method could be applied in a range of RL scenarios.
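The idea of sampling experiences according to a training value can be illustrated with a minimal sketch. The abstract does not define how T is computed, so the code below assumes the caller supplies a non-negative score for each transition; the class name ValuePrioritizedReplayBuffer, the proportional sampling rule, and the small priority floor are all illustrative assumptions, not the authors' method.

```python
import random
from collections import deque


class ValuePrioritizedReplayBuffer:
    """Hypothetical buffer that samples transitions in proportion to a score T."""

    def __init__(self, capacity, seed=None):
        # Bounded deques drop the oldest entry once capacity is reached.
        self.transitions = deque(maxlen=capacity)
        self.values = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, transition, training_value):
        # training_value stands in for T (assumed to be supplied by the caller).
        # A small floor keeps every stored transition sampleable.
        self.transitions.append(transition)
        self.values.append(max(float(training_value), 1e-6))

    def sample(self, batch_size):
        # Weighted sampling with replacement: higher T means more frequent replay.
        return self.rng.choices(
            list(self.transitions), weights=list(self.values), k=batch_size
        )

    def __len__(self):
        return len(self.transitions)
```

In a DQN-style training loop, such a buffer would take the place of uniform sampling: each transition is stored together with its score, and minibatches are then drawn with sample(batch_size).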