An Improved Q-learning Algorithm Based on Exploration Region Expansion Strategy

Qingji Gao, Bingrong Hong, Zhendong He, Jie Liu, Guochen Niu
DOI: https://doi.org/10.1109/WCICA.2006.1713159
2006-01-01
Abstract: Balancing exploration and exploitation is one of the key problems in Q-learning. To address it, an improved Q-learning algorithm based on an exploration region expansion strategy is proposed, building on Metropolis criterion-based Q-learning. With this strategy, blind exploration of the entire environment is eliminated and learning efficiency is increased. In addition, when the agent encounters an obstacle it seeks another feasible path, which makes the algorithm easy to implement on a real robot. An automatic termination condition is also put forward, so that redundant learning after the optimal path has been found is avoided and learning time is reduced. Simulation experiments demonstrate the validity of the algorithm.
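The abstract names Metropolis criterion-based Q-learning as the starting point but gives no formulas. As a rough illustration only, the sketch below shows one common form of Metropolis-style action selection combined with the standard one-step Q-learning update. The acceptance rule exp((Q_candidate - Q_greedy) / T), the function names metropolis_select and q_update, and the parameters temperature, alpha, and gamma are all assumptions made for this example and are not taken from the paper. The exploration region expansion strategy and the automatic termination condition are not specified in the abstract, so they are not shown.

```python
import math
import random


def metropolis_select(q_row, temperature, rng=random):
    """Pick an action index from q_row with a Metropolis-style acceptance test.

    Assumed rule (not from the paper): a uniformly random candidate action
    replaces the greedy action with probability
    exp((Q_candidate - Q_greedy) / temperature). temperature must be > 0.
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    # The exponent is <= 0, so the acceptance probability lies in [0, 1].
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply the standard one-step Q-learning update to q in place."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    # Hypothetical two-state, two-action table, for illustration only.
    q = [[0.0, 0.0], [0.0, 0.0]]
    action = metropolis_select(q[0], temperature=1.0)
    q_update(q, state=0, action=action, reward=1.0, next_state=1)
    print(action, q)
```

In this sketch a high temperature makes the random candidate likely to be accepted (more exploration), while a temperature close to zero makes selection almost purely greedy. Lowering the temperature over the course of learning is a typical way to shift from exploration to exploitation under these assumptions.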