An Average-Reward Reinforcement Learning Algorithm Based on Schweitzer's Transformation

Li Jianjun, Ren Jiangong, Li Yanjie
2012-01-01
Abstract: In this paper, we propose a relative value iteration reinforcement learning (RVI-RL) algorithm based on Schweitzer's transformation for Markov decision processes (MDPs) with average reward. Using Schweitzer's transformation, we derive an equivalent average-reward optimality equation and a new form of the action-value function. Combined with the theory of relative value iteration, the resulting RVI-RL algorithm not only omits the estimation of the average reward during learning, but also improves the convergence rate. Finally, a simulation experiment on the navigation of an autonomous mobile robot illustrates the effectiveness and applicability of the algorithm.
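As a rough illustration of the two ingredients named in the abstract, the following is a minimal model-based sketch, not the authors' learning algorithm. It assumes that "Schweitzer's transformation" refers to the aperiodicity transformation commonly attributed to Schweitzer, which replaces the transition matrix P by (1 - tau) I + tau P and the reward r by tau r for some 0 < tau < 1, and it then runs standard relative value iteration with a fixed reference state. The function name, the parameter `tau`, the reference-state choice, and the two-state example MDP are all hypothetical and chosen only for illustration.

```python
import numpy as np

def relative_value_iteration(P, r, tau=0.5, ref_state=0, tol=1e-10, max_iter=10_000):
    """Relative value iteration on a transformed average-reward MDP.

    P: array of shape (A, S, S), P[a, s, s'] = transition probability.
    r: array of shape (A, S), r[a, s] = expected one-step reward.
    tau: assumed transformation parameter, 0 < tau < 1.
    Returns (gain, relative values h, greedy policy).
    """
    n_actions, n_states, _ = P.shape

    # Assumed transformation: mix each transition matrix with the identity
    # and scale the rewards, which makes every policy's chain aperiodic.
    P_t = (1.0 - tau) * np.eye(n_states)[None, :, :] + tau * P
    r_t = tau * r

    h = np.zeros(n_states)
    for _ in range(max_iter):
        q = r_t + P_t @ h              # action values, shape (A, S)
        v = q.max(axis=0)              # one Bellman backup
        h_new = v - v[ref_state]       # subtract the reference state's value
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break

    q = r_t + P_t @ h
    # Since h[ref_state] == 0, the backup at ref_state is the transformed
    # gain; dividing by tau recovers the gain of the original MDP.
    gain = q.max(axis=0)[ref_state] / tau
    policy = q.argmax(axis=0)
    return gain, h, policy

# Hypothetical two-state MDP: action 0 swaps states (a periodic chain),
# action 1 stays in place.
P = np.array([[[0.0, 1.0], [1.0, 0.0]],
              [[1.0, 0.0], [0.0, 1.0]]])
r = np.array([[1.0, 3.0],
              [0.0, 1.5]])

gain, h, policy = relative_value_iteration(P, r)
print(gain, h, policy)
```

In this toy example the best policy always swaps, which yields a periodic chain with average reward (1 + 3) / 2 = 2. Note that the iteration never estimates the average reward separately: it is read off from the value of the reference state once the relative values have converged.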