Vishnu D. Sharma,Jeongran Lee,Matthew Andrews,Ilija Hadžić
Abstract:Local planning is an optimization process within a mobile robot navigation stack that searches for the best velocity vector, given the robot and environment state. Depending on how the optimization criteria and constraints are defined, some planners may be better than others in specific situations. We consider two conceptually different planners. The first planner explores the velocity space in real-time and has superior path-tracking and motion smoothness performance. The second planner was trained using reinforcement learning methods to produce the best velocity based on its training $"$experience$"$. It is better at avoiding dynamic obstacles but at the expense of motion smoothness. We propose a simple yet effective meta-reasoning approach that takes advantage of both approaches by switching between planners based on the surroundings. We demonstrate the superiority of our hybrid planner, both qualitatively and quantitatively, over the individual planners on a live robot in different scenarios, achieving an improvement of 26% in the navigation time.
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: in mobile robot navigation, how to combine the advantages of classical planners and planners based on Reinforcement Learning (RL) to achieve more efficient and smoother local path planning. Specifically:
1. **Classical planners**: They perform well in known maps and can generate smooth motion trajectories, but are not sensitive enough in response to dynamic obstacles.
2. **Planners based on RL**: They perform excellently when dealing with dynamic obstacles, but will lead to unsmooth motion, especially in the absence of dynamic obstacles.
To solve these problems, the paper proposes a Hybrid Local Planner, which can intelligently choose which planner to use in different situations. Specifically:
- **When the path is obstacle - free**, use a classical planner (such as DWA) to ensure the smoothness and efficiency of motion.
- **When an obstacle is detected in the path**, switch to a planner based on RL (such as SACPlanner) to better avoid the obstacle.
In this way, the hybrid planner can effectively deal with obstacles in a dynamic environment while maintaining smooth motion, thereby improving navigation performance. Experimental results show that this hybrid method can significantly outperform a single planner in different scenarios, especially improving the navigation time by 26%.
### Formula Summary
Some key formulas mentioned in the paper are as follows:
- **Reward function \(R(s, a)\)**:
\[
R(s, a)=(d_{\text{old}} - d_{\text{new}})\cdot\begin{cases}
1 & \text{if }d_{\text{old}} - d_{\text{new}}\geq0 \\
2 & \text{else}
\end{cases}
+(|\theta_{\text{old}}| - |\theta_{\text{new}}|)\cdot\begin{cases}
1 & \text{if }|\theta_{\text{old}}| - |\theta_{\text{new}}| \geq0 \\
2 & \text{else}
\end{cases}
- R_{\text{max}}\cdot\begin{cases}
1 & \text{if collision} \\
0 & \text{else}
\end{cases}
+ R_{\text{max}}\cdot\begin{cases}
1 & \text{if }d_{\text{new}} = 0 \\
0 & \text{else}
\end{cases}
- G(s')
\]
where:
- \(d_{\text{old}}, \theta_{\text{old}}\) are the distance and azimuth to the next waypoint in the current state \(s\).
- \(d_{\text{new}}, \theta_{\text{new}}\) are the distance and azimuth to the next waypoint in the new state \(s'\) after performing action \(a\).
- \(R_{\text{max}}\) is the reward/penalty for reaching the waypoint or collision.
- \(G(s')\) is the product of a truncated Gaussian kernel and the occupancy grid in state \(s'\).
In this way, the paper proposes a simple and effective meta - reasoning method that can select the optimal planning strategy according to the actual situation in different environments, thereby improving the navigation performance of the robot.