Agent Maze Path Planning Based on Simulated Annealing Q-Learning Algorithm

Zhongtian Mao, Zipeng Wu, Xiaohan Fang, Songsong Cheng, Yuan Fan
DOI: https://doi.org/10.23919/ccc55666.2022.9902817
2022-01-01
Abstract: Path exploration and planning for agents in unknown environments is a popular application problem in reinforcement learning. In this paper, we propose an improved reinforcement learning algorithm, the Q-Learning algorithm for adaptive exploration based on simulated annealing (AE-SAQL). We apply the algorithm to the agent path planning problem, improve the design of the reward function, and incorporate feedback information from the environment. By applying the Metropolis criterion from the simulated annealing algorithm and adding an adaptive adjustment mechanism, the agent explores the environment thoroughly and makes full use of environmental information. This resolves the exploration-exploitation dilemma during learning and ultimately enables the agent to reach the target location safely. Compared with the standard Q-Learning and SARSA algorithms, AE-SAQL performs better.
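To make the idea in the abstract concrete, the following is a minimal sketch of tabular Q-Learning in which action selection uses a Metropolis-style acceptance test. It is not the authors' AE-SAQL implementation: the maze layout, the reward values, the geometric cooling schedule, and all function names (`step`, `metropolis_action`, `train`, `greedy_path`) are assumptions made for illustration. The paper's adaptive adjustment mechanism and its reward design are not described in the abstract and are not reproduced here.

```python
import math
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; hitting a wall or the border leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < 4 and 0 <= nc < 4) or MAZE[nr][nc] == "#":
        return state, -5.0, False      # assumed collision penalty
    if MAZE[nr][nc] == "G":
        return (nr, nc), 10.0, True    # assumed goal reward
    return (nr, nc), -1.0, False       # assumed per-step cost


def metropolis_action(q_row, temperature, rng):
    """Propose a random action and accept it over the greedy action
    with probability exp((Q_random - Q_greedy) / T)."""
    greedy = max(range(len(q_row)), key=q_row.__getitem__)
    candidate = rng.randrange(len(q_row))
    delta = q_row[candidate] - q_row[greedy]   # never positive
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return greedy


def train(episodes=500, alpha=0.1, gamma=0.9, t0=1.0, cooling=0.99, seed=0):
    """Tabular Q-Learning with Metropolis action selection."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(4) for c in range(4)}
    temperature = t0
    for _ in range(episodes):
        state, done, steps = (0, 0), False, 0
        while not done and steps < 200:
            action = metropolis_action(q[state], temperature, rng)
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state, steps = nxt, steps + 1
        # Assumed geometric cooling; a stand-in for the adaptive mechanism.
        temperature = max(temperature * cooling, 1e-3)
    return q


def greedy_path(q, max_steps=20):
    """Follow the greedy policy from the start cell for at most max_steps moves."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(range(4), key=q[state].__getitem__)
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells visited under the learned greedy policy. At high temperature almost every random proposal is accepted, so the agent explores; as the temperature falls the acceptance probability for worse actions shrinks and the agent increasingly exploits its Q-values, which is the exploration-exploitation trade-off the abstract refers to.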