HOGN-TVGN: Human-inspired Embodied Object Goal Navigation Based on Time-varying Knowledge Graph Inference Networks for Robots

Baojiang Yang, Xianfeng Yuan, Zhongmou Ying, Jialin Zhang, Boyi Song, Yong Song, Fengyu Zhou, Weihua Sheng
DOI: https://doi.org/10.1016/j.aei.2024.102671
IF: 8.8
2024-01-01
Advanced Engineering Informatics
Abstract: Object goal navigation tasks are critical for robots operating in unfamiliar environments, where they must locate specific objects using visual cues. The ability to leverage prior knowledge significantly enhances a robot’s associative capabilities, leading to improved navigation performance. However, existing methods generalize poorly when navigation models are transferred to new environments, which is the key issue addressed in this paper. First, a time-varying knowledge graph is proposed that updates the prior knowledge graph with context vectors derived from co-occurring objects in the current observation. This approach prioritizes local graphs centered on the target and co-occurring objects, allowing for efficient and accurate target localization, and the dynamic updating mechanism facilitates efficient exploration in new scenarios. Second, to embed prior knowledge more rationally in the reinforcement learning-based navigation strategy, a time-varying knowledge graph inference network (TVGN) is presented. The TVGN uses context vectors and global spatial semantic information to perceive and understand the environment in real time. It formulates navigation strategies based on the precise goal information encoded within the graph, thereby enhancing the robot’s efficiency in reaching the target. Extensive comparative experiments on the widely used AI2-THOR dataset illustrate the effectiveness of the proposed method. Experimental results indicate that our model outperforms state-of-the-art competitors, demonstrating notable advantages in navigation effectiveness and efficiency in previously unseen environments.
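The idea of refreshing a prior knowledge graph from the current observation and then focusing on a local graph around the target can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify the update rule, so the context vectors are replaced here by a simple blend of edge weights toward 1.0 for objects seen together, and the names `update_graph`, `local_subgraph`, and `rate` are hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # undirected edge, stored with names in sorted order


def update_graph(prior: Dict[Edge, float], observed: Set[str],
                 rate: float = 0.5) -> Dict[Edge, float]:
    """Return a copy of the prior graph with co-observed pairs strengthened.

    Assumption: a plain exponential blend stands in for the paper's
    context-vector update, which the abstract does not detail.
    """
    updated = dict(prior)
    seen = sorted(observed)
    for i, a in enumerate(seen):
        for b in seen[i + 1:]:
            old = updated.get((a, b), 0.0)
            # Move the edge weight toward 1.0 for objects seen together now.
            updated[(a, b)] = (1.0 - rate) * old + rate
    return updated


def local_subgraph(graph: Dict[Edge, float], target: str,
                   observed: Set[str]) -> Dict[Edge, float]:
    """Keep only edges that touch the target or a currently observed object."""
    focus = observed | {target}
    return {e: w for e, w in graph.items() if e[0] in focus or e[1] in focus}


prior = {("Mug", "Sink"): 0.5, ("Bed", "Pillow"): 0.9}
graph = update_graph(prior, {"Sink", "Mug"}, rate=0.5)
local = local_subgraph(graph, "Mug", {"Sink"})
```

In this toy run the `("Mug", "Sink")` edge is strengthened because both objects appear in the observation, and the unrelated `("Bed", "Pillow")` edge is left out of the local graph used to search for the mug.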