Optimal path planning approach based on Q-learning algorithm for mobile robots

Abderraouf Maoudj, Abdelfetah Hentout
DOI: https://doi.org/10.1016/j.asoc.2020.106796
IF: 8.7
2020-12-01
Applied Soft Computing
Abstract:<p>Optimizing a path within a short computation time remains a major challenge for mobile robotics applications. In path planning and obstacle avoidance, the <em>Q-Learning</em> (<em>QL</em>) algorithm has been widely used as a computational method of learning through interaction with the environment. However, less emphasis has been placed on path optimization using <em>QL</em> because of its slow and weak convergence toward optimal solutions. Therefore, this paper proposes an <em>Efficient Q-Learning</em> (<em>EQL</em>) algorithm to overcome these limitations and ensure an optimal collision-free path in the least possible time. In the <em>QL</em> algorithm, successful learning depends closely on the design of an effective reward function and an efficient action-selection strategy that balances exploration and exploitation. In this regard, a new reward function is proposed to initialize the <em>Q-table</em> and provide the robot with prior knowledge of the environment, followed by a new selection strategy that accelerates the learning process through search space reduction while ensuring rapid convergence toward an optimized solution. The main idea is to intensify the search, at each learning stage, around the straight-line segment linking the current position of the robot to <span class="math"><math>Target</math></span> (the optimal path in terms of length). During the learning process, the proposed strategy favors promising actions that not only lead to an optimized path but also accelerate convergence. The proposed <em>EQL</em> algorithm is first validated using benchmarks from the literature, followed by a comparison with other existing <em>QL</em>-based algorithms. The results show that the proposed <em>EQL</em> achieves good learning proficiency and that training performance is significantly improved compared to the state of the art. In conclusion, <em>EQL</em> improves the quality of the paths in terms of length, computation time, and robot safety, and it outperforms other optimization algorithms.</p>
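The two ideas named in the abstract (a Q-table seeded with prior knowledge, and action selection focused toward the target) can be illustrated with a minimal Python sketch. This is not the authors' EQL: the abstract gives no formulas, so the grid world, the distance-based prior, the reward values, and every function name below are assumptions chosen for illustration. Only the update rule is the standard Q-learning one.

```python
import math
import random

# Hypothetical 4-connected grid world; all names and constants are assumptions.
ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def is_free(cell, size, obstacles):
    x, y = cell
    return 0 <= x < size and 0 <= y < size and cell not in obstacles

def init_q_table(size, target, obstacles):
    """Seed Q with a prior: free moves score by closeness to the target."""
    q = {}
    for x in range(size):
        for y in range(size):
            for a, (dx, dy) in enumerate(ACTIONS):
                nxt = (x + dx, y + dy)
                q[(x, y), a] = (-math.dist(nxt, target)
                                if is_free(nxt, size, obstacles) else -100.0)
    return q

def step(state, a):
    return (state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1])

def select_action(q, state, target, size, obstacles, epsilon, rng):
    """Epsilon-greedy where exploration only samples moves approaching the target."""
    if rng.random() < epsilon:
        free = [a for a in range(len(ACTIONS))
                if is_free(step(state, a), size, obstacles)]
        closer = [a for a in free
                  if math.dist(step(state, a), target) < math.dist(state, target)]
        # Fall back to any free move (or any move) when nothing gets closer.
        return rng.choice(closer or free or list(range(len(ACTIONS))))
    return max(range(len(ACTIONS)), key=lambda a: q[state, a])

def run_episode(q, start, target, size, obstacles, alpha=0.5, gamma=0.9,
                epsilon=0.1, max_steps=200, rng=None):
    """Run one episode of standard Q-learning updates; return the visited cells."""
    rng = rng or random.Random(0)
    state, path = start, [start]
    for _ in range(max_steps):
        if state == target:
            break
        a = select_action(q, state, target, size, obstacles, epsilon, rng)
        nxt = step(state, a)
        if not is_free(nxt, size, obstacles):
            nxt, reward = state, -100.0   # collision: stay in place, heavy penalty
        elif nxt == target:
            reward = 100.0
        else:
            reward = -1.0
        future = 0.0 if nxt == target else max(q[nxt, b] for b in range(len(ACTIONS)))
        # Standard Q-learning update rule.
        q[state, a] += alpha * (reward + gamma * future - q[state, a])
        state = nxt
        path.append(state)
    return path
```

On an obstacle-free grid, the prior alone already steers a greedy first episode straight to the target, for example `run_episode(init_q_table(5, (4, 4), set()), (0, 0), (4, 4), 5, set(), epsilon=0.0)`. With obstacles, the learned updates have to correct the prior wherever the direct route is blocked.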
computer science, artificial intelligence, interdisciplinary applications