Undiscounted Reinforcement Learning Algorithm Based on Performance Potentials

ZHOU Ru-yi, GAO Yang
DOI: https://doi.org/10.3969/j.issn.1001-6600.2006.04.015
2006-01-01
Abstract: Traditional performance potential-based learning algorithms can obtain optimal policies in Markov decision process (MDP) problems. They mainly adopt methods based on a single sample path, which makes them less efficient. In this paper, a new learning algorithm that combines performance potentials with reinforcement learning is proposed. Compared with the classic R-learning algorithm, it shows promising results.
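For orientation, the R-learning baseline named in the abstract can be sketched as follows. This is not the algorithm proposed in the paper, whose details the abstract does not give. It is a minimal tabular form of one common variant of undiscounted (average-reward) R-learning. The two-state MDP, the step sizes `beta` and `alpha`, the exploration rate `epsilon`, and the names `TRANSITIONS`, `r_learning`, and `greedy_policy` are all assumptions made for illustration.

```python
import random

# Hypothetical two-state, two-action MDP, used only for illustration.
# TRANSITIONS[state][action] = (next_state, reward); dynamics are deterministic.
# Staying in state 1 pays 2 per step, so the best average reward is 2.
TRANSITIONS = {
    0: {0: (0, 1.0), 1: (1, 0.0)},
    1: {0: (0, 0.0), 1: (1, 2.0)},
}


def r_learning(steps=100_000, beta=0.1, alpha=0.01, epsilon=0.1, seed=0):
    """Tabular R-learning sketch: returns (relative action values, average-reward estimate)."""
    rng = random.Random(seed)
    q = {s: [0.0, 0.0] for s in TRANSITIONS}  # relative (undiscounted) action values
    rho = 0.0                                 # running estimate of the average reward
    state = 0
    for _ in range(steps):
        # Epsilon-greedy action selection.
        greedy = max(range(2), key=lambda a: q[state][a])
        action = rng.randrange(2) if rng.random() < epsilon else greedy
        was_greedy = q[state][action] == max(q[state])

        next_state, reward = TRANSITIONS[state][action]

        # The value update subtracts rho instead of applying a discount factor.
        td_error = reward - rho + max(q[next_state]) - q[state][action]
        q[state][action] += beta * td_error

        # The average-reward estimate is updated only after greedy actions.
        if was_greedy:
            rho += alpha * (reward + max(q[next_state]) - max(q[state]) - rho)

        state = next_state
    return q, rho


def greedy_policy(q):
    """Map each state to the action with the highest relative value."""
    return {s: max(range(len(values)), key=lambda a: values[a]) for s, values in q.items()}


if __name__ == "__main__":
    q_values, avg_reward = r_learning()
    print("estimated average reward:", round(avg_reward, 3))
    print("greedy policy:", greedy_policy(q_values))
```

In this toy problem the greedy policy should move to state 1 and stay there, with the average-reward estimate settling near 2. The update uses a single sample path, which is the efficiency limitation the abstract attributes to traditional methods.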