Deep reinforcement learning via good choice resampling experience replay memory

Xi-liang CHEN, Lei CAO, Chen-xi LI, Zhi-xiong XU, Ming HE
DOI: https://doi.org/10.13195/j.kzyjc.2017.0261
2018-01-01
Abstract: To build an effective experience replay mechanism for deep reinforcement learning, a method is proposed that constructs the replay memory by resampling and selecting good experiences according to their temporal-difference (TD) error. Rank-based algorithms using stratified sampling are also developed to avoid the collapse of the training data set. Using this mechanism, several typical deep reinforcement learning algorithms based on DQN (deep Q-networks) are improved. Simulations on the CartPole control problem in OpenAI Gym show that the mechanism improves the quality of training samples, effectively improves the learning of the value function, and achieves good learning efficiency and generalization performance. Convergence speed and training performance improve significantly.
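The abstract does not give the algorithm's details, so the following is only a minimal sketch of how a replay memory ranked by TD error with stratified sampling might look. The class name `RankedReplayBuffer`, the eviction rule (drop the transition with the smallest absolute TD error when the memory is full), and the equal-size strata are all assumptions made for illustration, not the authors' method.

```python
import random


class RankedReplayBuffer:
    """Toy replay memory that ranks transitions by absolute TD error.

    Hypothetical illustration only; not the implementation from the paper.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []  # list of (abs_td_error, transition) pairs
        self.rng = random.Random(seed)

    def add(self, transition, td_error):
        """Store a transition together with the magnitude of its TD error."""
        self.items.append((abs(td_error), transition))
        if len(self.items) > self.capacity:
            # Assumed eviction rule: drop the entry with the smallest |TD error|.
            smallest = min(range(len(self.items)), key=lambda i: self.items[i][0])
            self.items.pop(smallest)

    def sample(self, batch_size):
        """Draw one transition uniformly from each of batch_size rank strata."""
        if not 0 < batch_size <= len(self.items):
            raise ValueError("batch_size must be between 1 and the buffer size")
        # Rank from largest to smallest |TD error|.
        ranked = sorted(self.items, key=lambda item: item[0], reverse=True)
        n = len(ranked)
        batch = []
        for k in range(batch_size):
            # Stratum k covers ranks [lo, hi); it is never empty because
            # batch_size <= n.
            lo = k * n // batch_size
            hi = (k + 1) * n // batch_size
            batch.append(ranked[self.rng.randrange(lo, hi)][1])
        return batch
```

Because every stratum contributes exactly one transition, each batch mixes high-error and low-error experiences, which is one plausible way to keep the training data from collapsing onto a few high-error samples. A call such as `RankedReplayBuffer(capacity=1000).sample(32)` would be used after the memory has been filled through `add`.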