Policy Gradient Fuzzy Reinforcement Learning

Xuening Wang, Xin Xu, Hangen He
DOI: https://doi.org/10.1109/icmlc.2004.1382332
2004-01-01
Abstract: This paper presents a new approach for tuning the conclusions of fuzzy rules based on reinforcement learning. Unlike most existing fuzzy reinforcement learning algorithms, which are based on value functions, our approach, called policy gradient fuzzy reinforcement learning (PGFRL), is based on gradient estimation. In PGFRL, the GPOMDP algorithm is employed to estimate the performance gradient with respect to the parameters of the fuzzy rules. We prove that the parameters of the fuzzy rules converge to a local optimum under the necessary conditions. Experimental results show the effectiveness of PGFRL.
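The gradient estimation step mentioned in the abstract can be illustrated with a minimal sketch of the standard GPOMDP recursion (an eligibility trace discounted by a factor beta, plus a running average of reward times trace). This is not the authors' implementation: the function name `gpomdp_estimate`, its arguments, and the list-based representation are assumptions made for illustration. The abstract does not say how the fuzzy rules parameterize the policy, so the sketch takes the score vectors (gradients of the log action probability with respect to the rule parameters) as given inputs.

```python
def gpomdp_estimate(scores, rewards, beta):
    """Return a GPOMDP-style gradient estimate from one trajectory.

    scores[t]  : gradient of log pi(a_t | s_t) w.r.t. the parameters
                 (a list of floats; assumed to be supplied by the caller).
    rewards[t] : reward observed after taking action a_t.
    beta       : trace discount in [0, 1); trades bias against variance.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if len(scores) != len(rewards):
        raise ValueError("scores and rewards must have the same length")
    if not scores:
        return []

    n_params = len(scores[0])
    z = [0.0] * n_params      # eligibility trace
    delta = [0.0] * n_params  # running gradient estimate

    for t, (score, reward) in enumerate(zip(scores, rewards)):
        # z_{t+1} = beta * z_t + grad log pi(a_t | s_t)
        z = [beta * z_i + g_i for z_i, g_i in zip(z, score)]
        # delta_{t+1} = delta_t + (r_{t+1} * z_{t+1} - delta_t) / (t + 1)
        delta = [d_i + (reward * z_i - d_i) / (t + 1)
                 for d_i, z_i in zip(delta, z)]
    return delta
```

In a fuzzy controller, the returned vector would be used in a gradient ascent step on the rule parameters; the step size and the form of the policy are left open here because the abstract does not specify them.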