Multigoal Visual Navigation With Collision Avoidance via Deep Reinforcement Learning

Wendong Xiao, Liang Yuan, Li He, Teng Ran, Jianbo Zhang, Jianping Cui
DOI: https://doi.org/10.1109/tim.2022.3158384
IF: 5.6
2022-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: Learning to map the images acquired by a moving agent equipped with a camera sensor to motion commands for multigoal navigation is challenging. Most existing approaches still struggle with collision avoidance, slow convergence, and generalization. In this article, a novel actor–critic architecture is presented to learn the optimal navigation policy. We introduce single-step reward observation and a collision penalty to reshape the reinforcement learning (RL) reward function. Through the reshaped reward function, the agent acquires collision perception, which is treated as measurement information from the visual observation and used to avoid obstacles. In addition, expert trajectories are used to generate subgoals, and a subgoal reward shaping is proposed to accelerate policy learning with this expert knowledge. To generate human-aware navigation policies, an observation-action consistency (OAC) model is introduced to ensure that the agent reaches the subgoals in turn and moves toward the target. The whole training process follows a self-supervised RL approach, accompanied by an expert supervision signal. This method balances exploration and exploitation, helping the proposed model generalize to unseen goals. Training experiments on AI2-THOR show better performance and faster convergence than existing approaches. For generalization to unseen goals, the proposed method achieves a state-of-the-art success rate, with at least a 30% improvement in average episode collision.
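As a rough illustration of the reward reshaping described in the abstract, the sketch below combines a per-step term, a collision penalty, and a subgoal bonus into one scalar reward. It is not the paper's formulation: the function name `shaped_reward`, the `RewardConfig` fields, and every numeric value are assumptions chosen only to show how such terms could be composed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardConfig:
    # All values below are illustrative assumptions, not taken from the paper.
    step_penalty: float = -0.01      # small cost for every action taken
    collision_penalty: float = -0.1  # extra cost when the action causes a collision
    subgoal_bonus: float = 1.0       # bonus for reaching the next expert-derived subgoal
    goal_reward: float = 10.0        # reward for reaching the final target


def shaped_reward(collided: bool, reached_subgoal: bool, reached_goal: bool,
                  cfg: RewardConfig = RewardConfig()) -> float:
    """Combine the per-step, collision, subgoal, and goal terms into one reward."""
    reward = cfg.step_penalty
    if collided:
        reward += cfg.collision_penalty
    if reached_subgoal:
        reward += cfg.subgoal_bonus
    if reached_goal:
        reward += cfg.goal_reward
    return reward
```

In a sketch like this, the collision term gives the policy a direct penalty signal tied to the current visual observation, while the subgoal bonus supplies denser feedback than a goal-only reward, which is the intuition behind the faster convergence the abstract reports.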
Engineering, Electrical & Electronic; Instruments & Instrumentation