An Optimized Q-Learning Algorithm Based on the Thinking of Tabu Search

Xiaogang Zhang, Zhijing Liu
DOI: https://doi.org/10.1109/ISCID.2008.179
2008-01-01
Abstract: One core issue in reinforcement learning is the balance between exploration and exploitation. Pure exploitation leads the agent to a local optimum quickly. Exploration helps the agent escape local optima, but too much exploration reduces the performance of the Q-learning algorithm. Avoiding local optima while still finding the global optimum is therefore one of the key goals of action selection in Q-learning. In this paper, the idea of the tabu search algorithm is introduced to balance exploration and exploitation in Q-learning. The optimized algorithm, called T-Q-learning, is shown in experiments to converge faster and to avoid local optima.
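The abstract does not specify how the tabu list enters action selection, so the following is only a minimal sketch of one plausible reading, not the authors' T-Q-learning. It assumes a tabular Q-learning agent on a hypothetical six-state chain, where recently taken (state, action) pairs are held in a short tabu list and the agent acts greedily among the non-tabu actions. The names `select_action` and `train`, the chain environment, the reward, and all parameter values are illustrative assumptions.

```python
import random
from collections import deque


def select_action(q_row, state, tabu, rng):
    """Greedy choice among actions whose (state, action) pair is not tabu."""
    allowed = [a for a in range(len(q_row)) if (state, a) not in tabu]
    if not allowed:
        # Fallback (assumption): if every action is tabu, ignore the list.
        allowed = list(range(len(q_row)))
    best = max(q_row[a] for a in allowed)
    # Break ties at random so unexplored actions are still tried.
    return rng.choice([a for a in allowed if q_row[a] == best])


def train(n_states=6, episodes=200, alpha=0.5, gamma=0.9, tabu_len=3, seed=0):
    """Tabular Q-learning on a chain; action 0 moves left, action 1 moves right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        tabu = deque(maxlen=tabu_len)  # oldest entries expire automatically
        for _ in range(100):  # cap on steps per episode
            action = select_action(q[state], state, tabu, rng)
            tabu.append((state, action))
            nxt = max(0, state - 1) if action == 0 else state + 1
            done = nxt == n_states - 1  # the rightmost state is the goal
            reward = 1.0 if done else 0.0
            # Standard Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

In this sketch the tabu list plays the role that random exploration plays in epsilon-greedy selection: it forces the agent away from pairs it has just tried instead of injecting random actions. The tabu length controls how strongly recent choices are avoided. Whether this matches the paper's mechanism cannot be determined from the abstract alone.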