SAVE: Spatial-Attention Visual Exploration.

Xinyi Yang, Chao Yu, Jiaxuan Gao, Yu Wang, Huazhong Yang
DOI: https://doi.org/10.1109/icip46576.2022.9897714
2022-01-01
Abstract: Visual indoor exploration requires agents to explore a room in a limited time. Current planning-based solutions have a time-consuming inference stage and require many handcrafted parameters across different scenes. Reinforcement Learning (RL) schemes, on the other hand, address these problems by automatically updating flexible policies and offering faster inference. Motivated by the advantages of RL, we introduce Spatial Attention Visual Exploration (SAVE), which is based on Active Neural SLAM (ANS) [1]. Specifically, we propose a novel RL-based global planner named Spatial Global Policy (SGP) that uses spatial information to promote efficient exploration through global goal guidance. SGP has two major components: a transformer-based spatial-attention module that encodes the spatial interrelation between the agent and different regions to perform spatial reasoning, and a hierarchical spatial action selector that infers global goals for faster training. The map representations are aligned through our spatial adjustor. Experiments on the Habitat photo-realistic simulator [2] demonstrate that SAVE outperforms current planning-based methods and RL variants, reducing the processing steps by at least 10% and the repeat ratio by 15%, and running 2x to 4x faster than planning-based methods.
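The two SGP components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no architectural details, so the feature dimensions, the 2x2 region grid, the region size, and the function names `attend` and `select_global_goal` are all assumptions made for illustration. The sketch computes scaled dot-product attention weights from one agent query over per-region features, then picks a goal in two levels (a region first, then a cell inside it). A learned policy would normally sample both levels from trained distributions; a deterministic argmax and a fixed offset stand in for that here.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(agent_query, region_keys):
    # Scaled dot-product attention weights of one agent query over region keys.
    d = len(agent_query)
    scores = [sum(q * k for q, k in zip(agent_query, key)) / math.sqrt(d)
              for key in region_keys]
    return softmax(scores)

def select_global_goal(region_weights, grid_size, region_size, local_offset):
    # Two-level choice (assumed): highest-weighted region, then a cell inside it.
    region = max(range(len(region_weights)), key=region_weights.__getitem__)
    row, col = divmod(region, grid_size)
    dy, dx = local_offset
    return (row * region_size + dy, col * region_size + dx)

# Hypothetical setup: 2x2 grid of regions, each 4x4 map cells, 2-d features.
agent_query = [1.0, 0.0]
region_keys = [[0.1, 0.9], [0.8, 0.2], [2.0, 0.0], [0.0, 0.0]]
weights = attend(agent_query, region_keys)
goal = select_global_goal(weights, grid_size=2, region_size=4, local_offset=(1, 3))
print(weights, goal)
```

Splitting the choice into a region and an in-region offset keeps each decision small (4 regions and 16 cells in this toy setup, rather than 64 cells at once), which is one plausible reading of why a hierarchical selector could speed up training.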