Goal-Guided Transformer-Enabled Reinforcement Learning for Efficient Autonomous Navigation

Wenhui Huang, Yanxin Zhou, Xiangkun He, Chen Lv
DOI: https://doi.org/10.1109/TITS.2023.3312453
2023-09-24
Abstract: Despite some successful applications of goal-driven navigation, existing deep reinforcement learning (DRL)-based approaches notoriously suffer from poor data efficiency. One of the reasons is that the goal information is decoupled from the perception module and directly introduced as a condition of decision-making, resulting in the goal-irrelevant features of the scene representation playing an adversarial role during the learning process. In light of this, we present a novel Goal-guided Transformer-enabled reinforcement learning (GTRL) approach that takes the physical goal states as an input of the scene encoder, guiding the scene representation to couple with the goal information and realizing efficient autonomous navigation. More specifically, we propose a novel variant of the Vision Transformer as the backbone of the perception system, namely the Goal-guided Transformer (GoT), and pre-train it with expert priors to boost data efficiency. Subsequently, a reinforcement learning algorithm is instantiated for the decision-making system, taking the goal-oriented scene representation from the GoT as the input and generating decision commands. As a result, our approach motivates the scene representation to concentrate mainly on goal-relevant features, which substantially enhances the data efficiency of the DRL learning process, leading to superior navigation performance. Both simulation and real-world experimental results demonstrate the superiority of our approach in terms of data efficiency, performance, robustness, and sim-to-real generalization, compared with other state-of-the-art (SOTA) baselines.
The demonstration video (https://www.youtube.com/watch?v=aqJCHcsj4w0) and the source code (https://github.com/OscarHuangWind/DRL-Transformer-SimtoReal-Navigation) are also provided.
Robotics, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses the poor data efficiency of existing goal-driven navigation methods based on deep reinforcement learning (DRL). Current methods typically decouple goal information from the perception module and introduce it directly as a condition of decision-making, so goal-irrelevant features in the scene representation work against the learning process. To solve this problem, the authors propose Goal-guided Transformer-enabled reinforcement learning (GTRL), which feeds the physical goal state into the scene encoder so that the scene representation is coupled with the goal information. The perception backbone is a variant of the Vision Transformer, called the Goal-guided Transformer (GoT), which is pre-trained with expert priors to improve data efficiency. A reinforcement learning algorithm then takes the goal-oriented scene representation from the GoT as input and generates decision commands. Because the scene representation concentrates mainly on goal-relevant features, the DRL learning process becomes substantially more data-efficient, which in turn improves navigation performance. Simulation and real-world experiments show that the method outperforms other state-of-the-art baselines in data efficiency, performance, robustness, and sim-to-real generalization.
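The idea of feeding the goal into the scene encoder can be illustrated with a minimal NumPy sketch. This is not the authors' GoT implementation: it assumes, purely for illustration, that the goal state is a 2-D vector (for example distance and heading), that it is projected into one extra token placed in front of the image patch tokens, and that a single self-attention layer with random weights stands in for the encoder. The names `patchify`, `init_params`, and `encode` are hypothetical, and position embeddings, multiple heads, and pre-training are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def patchify(image, patch):
    """Split an (H, W) image into flattened, non-overlapping patches."""
    h, w = image.shape
    rows, cols = h // patch, w // patch
    blocks = image[: rows * patch, : cols * patch].reshape(rows, patch, cols, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(rows * cols, patch * patch)


def init_params(rng, patch, goal_dim, d_model):
    """Random projection weights (hypothetical; a real model would learn these)."""
    def w(shape):
        return rng.normal(0.0, 0.1, shape)

    return {
        "w_patch": w((patch * patch, d_model)),  # patch pixels -> token
        "w_goal": w((goal_dim, d_model)),        # goal state -> token
        "w_q": w((d_model, d_model)),
        "w_k": w((d_model, d_model)),
        "w_v": w((d_model, d_model)),
    }


def encode(image, goal, params, patch):
    """One self-attention layer over [goal token, patch tokens].

    Returns the output at the goal-token position (used here as the
    goal-conditioned scene feature) and that token's attention weights.
    """
    patch_tokens = patchify(image, patch) @ params["w_patch"]     # (N, d)
    goal_token = (goal @ params["w_goal"])[None, :]               # (1, d)
    tokens = np.concatenate([goal_token, patch_tokens], axis=0)   # (N + 1, d)

    q = tokens @ params["w_q"]
    k = tokens @ params["w_k"]
    v = tokens @ params["w_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)       # (N + 1, N + 1)
    out = tokens + attn @ v                                       # residual connection
    return out[0], attn[0]


rng = np.random.default_rng(0)
params = init_params(rng, patch=4, goal_dim=2, d_model=8)
image = rng.random((16, 16))            # stand-in for a camera or depth frame
goal = np.array([2.0, 0.5])             # assumed layout: (distance, heading)
feature, goal_attention = encode(image, goal, params, patch=4)
print(feature.shape, goal_attention.shape)
```

In this sketch the goal token attends over every patch, so the returned feature depends on both the scene and the goal. A downstream policy would consume that feature in place of a goal-agnostic image embedding.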