NavTr: Object-Goal Navigation with Learnable Transformer Queries

Qiuyu Mao, Jikai Wang, Meng Xu, Zonghai Chen
DOI: https://doi.org/10.1109/lra.2024.3497718
IF: 5.2
2024-01-01
IEEE Robotics and Automation Letters
Abstract: This paper introduces Navigation Transformer (NavTr), a novel framework for object-goal navigation that uses Transformer queries to enhance the learning and representation of environment states. By integrating semantic information, object positions, and neighborhood information, NavTr creates a unified, comprehensive, and extensible state representation for the object-goal navigation task. Within the framework, the Transformer queries implicitly learn inter-object relationships, which facilitates a high-level understanding of the environment. Additionally, NavTr implements target-oriented supervisory signals, such as rotation rewards and a spatial loss, which improve exploration efficiency in the reinforcement learning framework. NavTr outperforms popular graph-based and attention-based methods by a large margin in terms of success rate (SR) and success weighted by path length (SPL). Extensive experiments on the AI2-THOR dataset demonstrate the effectiveness of the approach.
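The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions used in embodied-navigation evaluation: SR is the fraction of successful episodes, and SPL averages, over all episodes, the success indicator multiplied by the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length. The function and parameter names are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by path length, averaged over all episodes.

    Assumed standard definition: mean of S_i * l_i / max(p_i, l_i),
    where l_i is the shortest-path length and p_i the agent's path length.
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        # Failed episodes contribute 0; degenerate zero-length episodes are skipped.
        if success and denom > 0:
            total += shortest / denom
    return total / n


if __name__ == "__main__":
    # Three hypothetical episodes: an inefficient success, a failure, an optimal success.
    done = [True, False, True]
    shortest = [2.0, 3.0, 4.0]
    taken = [4.0, 3.0, 4.0]
    print(success_rate(done))
    print(spl(done, shortest, taken))
```

Because SPL scales each success by path efficiency, an agent that reaches the target by a long detour scores lower than one that takes a near-optimal route, which is why the two metrics are usually reported together.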