Discrete-Time Deterministic $Q$-Learning: A Novel Convergence Analysis
Qinglai Wei, Frank L. Lewis, Qiuye Sun, Pengfei Yan, Ruizhuo Song
DOI: https://doi.org/10.1109/tcyb.2016.2542923
IF: 11.8
2017-01-01
IEEE Transactions on Cybernetics
Abstract: In this paper, a novel discrete-time deterministic $Q$-learning algorithm is developed. In each iteration of the developed algorithm, the iterative $Q$ function is updated over the entire state and control spaces, instead of for a single state and a single control as in traditional $Q$-learning algorithms. A new convergence criterion is established to guarantee that the iterative $Q$ function converges to the optimum, and it simplifies the convergence criterion on the learning rates required by traditional $Q$-learning algorithms. In the convergence analysis, upper and lower bounds of the iterative $Q$ function are analyzed to obtain the convergence criterion, instead of analyzing the iterative $Q$ function itself. For convenience of analysis, the convergence properties of the deterministic $Q$-learning algorithm are first developed for the undiscounted case. Then, taking the discount factor into account, the convergence criterion for the discounted case is established. To facilitate implementation, neural networks are used to approximate the iterative $Q$ function and to compute the iterative control law, respectively. Finally, simulation results and comparisons are given to illustrate the performance of the developed algorithm.
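The full-sweep update described in the abstract can be illustrated with a minimal tabular sketch. Everything below is an assumption made for illustration and is not taken from the paper: the update form $Q_{i+1}(x,u) = (1-\alpha_i) Q_i(x,u) + \alpha_i [U(x,u) + \gamma \min_{u'} Q_i(F(x,u),u')]$, the zero initial $Q$ function, the five-state toy system, the quadratic cost, the constant learning rate, and all function names. A lookup table stands in for the neural network approximators mentioned in the abstract.

```python
def deterministic_q_learning(states, controls, step, cost, alpha,
                             gamma=1.0, iterations=200):
    """Tabular sketch: update Q for every (state, control) pair per iteration.

    step(x, u)  -> next state (hypothetical deterministic dynamics F)
    cost(x, u)  -> stage cost (hypothetical utility U)
    alpha(i)    -> learning rate used in iteration i
    """
    # Assumed initial guess: Q_0 = 0 everywhere.
    q = {(x, u): 0.0 for x in states for u in controls}
    for i in range(iterations):
        rate = alpha(i)
        # min over controls of the current iterate Q_i, computed once so that
        # every pair is updated from the same Q_i (a synchronous sweep).
        best = {x: min(q[(x, u)] for u in controls) for x in states}
        q = {
            (x, u): (1.0 - rate) * q[(x, u)]
            + rate * (cost(x, u) + gamma * best[step(x, u)])
            for x in states
            for u in controls
        }
    return q


# Hypothetical toy system: five states on a line, controls move left/stay/right.
STATES = range(5)
CONTROLS = (-1, 0, 1)


def step(x, u):
    # Deterministic dynamics, clipped to the state range [0, 4].
    return min(max(x + u, 0), 4)


def cost(x, u):
    # Quadratic stage cost; state 0 with control 0 costs nothing.
    return float(x * x + u * u)


# Undiscounted case (gamma = 1) with a constant learning rate of 0.5.
Q = deterministic_q_learning(STATES, CONTROLS, step, cost, alpha=lambda i: 0.5)
V = [min(Q[(x, u)] for u in CONTROLS) for x in STATES]
policy = [min(CONTROLS, key=lambda u: Q[(x, u)]) for x in STATES]
```

Here `V` holds the approximate optimal cost from each state and `policy` the greedy control derived from the final table. Passing a different `alpha` callable or a `gamma` below 1 lets the same sketch be used to experiment with the discounted case.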