Reinforcement Learning Method For Continuous State Space Based On Dynamic Neural Network

Wei Sun, Xuesong Wang, Yuhu Cheng
DOI: https://doi.org/10.1109/WCICA.2008.4594438
2008-01-01
Abstract: One of the difficulties in applying reinforcement learning methods to real-world problems is generalization over large-scale or continuous state spaces. To avoid the curse of dimensionality caused by discretizing a continuous state space, this paper proposes a Q-learning method for continuous state spaces based on a dynamic Elman neural network. The input of the Elman network is the system state-action pair and its output is the corresponding Q-value; that is, the Elman network estimates the Q-value of a state-action pair online. To speed up the network's learning, an eligibility trace for the connection weights is introduced, borrowing from the state eligibility trace mechanism of the Temporal Difference algorithm. Computer simulations on mountain car control illustrate the performance and applicability of the proposed Q-learning scheme.
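The abstract gives no equations, so the following is only a minimal sketch of how the described pieces could fit together, not the authors' implementation. It shows an Elman network mapping a state-action pair to a Q-value, with one eligibility trace per connection weight. The names `ElmanQ` and `td_step`, the tanh hidden units, the hyperparameter values, and the truncated gradient (context units treated as constant inputs) are all assumptions made for illustration. Trace resetting after exploratory actions is omitted.

```python
import math
import random


class ElmanQ:
    """Tiny Elman network (hypothetical): input (state, action) -> scalar Q."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each hidden unit sees the inputs, the context units, and a bias.
        self.w_h = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hidden + 1)]
                    for _ in range(n_hidden)]
        # Output unit sees the hidden units and a bias.
        self.w_o = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden + 1)]
        self.context = [0.0] * n_hidden
        self.reset_traces()

    def reset_traces(self):
        # One eligibility trace per connection weight.
        self.e_h = [[0.0] * len(row) for row in self.w_h]
        self.e_o = [0.0] * len(self.w_o)

    def forward(self, x):
        """Return (q, hidden, z) without changing the context units."""
        z = list(x) + self.context + [1.0]
        hidden = [math.tanh(sum(w * v for w, v in zip(row, z)))
                  for row in self.w_h]
        q = sum(w * h for w, h in zip(self.w_o, hidden + [1.0]))
        return q, hidden, z

    def accumulate_trace(self, x, decay):
        """e <- decay * e + dQ/dw (assumed truncated gradient), then copy-back."""
        q, hidden, z = self.forward(x)
        for j, h in enumerate(hidden + [1.0]):
            self.e_o[j] = decay * self.e_o[j] + h
        for j, h in enumerate(hidden):
            back = self.w_o[j] * (1.0 - h * h)  # derivative through tanh
            for i, v in enumerate(z):
                self.e_h[j][i] = decay * self.e_h[j][i] + back * v
        self.context = hidden  # Elman copy of hidden activations
        return q

    def apply_td_error(self, delta, alpha):
        """w <- w + alpha * delta * e for every weight."""
        for j in range(len(self.w_o)):
            self.w_o[j] += alpha * delta * self.e_o[j]
        for j, row in enumerate(self.w_h):
            for i in range(len(row)):
                row[i] += alpha * delta * self.e_h[j][i]


def td_step(net, x, reward, next_inputs,
            alpha=0.05, gamma=0.95, lam=0.8, terminal=False):
    """One Q-learning update for the taken pair x.

    next_inputs lists the (next state, action) input vector for each action.
    """
    q = net.accumulate_trace(x, gamma * lam)
    best_next = 0.0 if terminal else max(net.forward(nx)[0] for nx in next_inputs)
    delta = reward + gamma * best_next - q
    net.apply_td_error(delta, alpha)
    return delta
```

For a task such as mountain car, `x` could be `(position, velocity, action)` and `next_inputs` the same layout for each candidate action in the next state. Because `forward` leaves the context units untouched, all candidate actions can be scored before one is chosen.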