TD3 with Composite Forgetting Prioritized Experience Replay

Zhi Wang
DOI: https://doi.org/10.1109/AIEA62095.2024.10692993
2024-06-14
Abstract: In recent years, actor-critic deep reinforcement learning methods such as Twin Delayed Deep Deterministic Policy Gradient (TD3) have performed well in continuous action control tasks. However, TD3 also has shortcomings: in some complex environments it may struggle to converge because rewards are sparse. In such cases, convergence can be improved by making better use of the experiences stored in the experience pool. By analyzing prioritized experience replay (PER) and selective experience replay (SER), we combine their advantages and propose a composite forgetting prioritized experience replay (CFPER) method. Integrating this method with TD3 yields the CFPER-TD3 algorithm, which effectively accelerates the convergence of TD3. The results show that CFPER-TD3 improves upon the original TD3 and achieves excellent performance in continuous action control tasks.
Computer Science