Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps

Linfeng Zhao, Lawson L.S. Wong
2024-12-17
Abstract: Learning navigation capabilities in different environments has long been one of the major challenges in decision-making. In this work, we focus on zero-shot navigation ability using given abstract $2$-D top-down maps. Like human navigation by reading a paper map, the agent reads the map as an image when navigating in a novel layout, after learning to navigate on a set of training maps. We propose a model-based reinforcement learning approach for this multi-task learning problem, where it jointly learns a hypermodel that takes top-down maps as input and predicts the weights of the transition network. We use the DeepMind Lab environment and customize layouts using generated maps. Our method can adapt better to novel environments in zero-shot and is more robust to noise.
Machine Learning, Artificial Intelligence, Robotics
### What problem does this paper attempt to solve?

This paper addresses zero-shot navigation in new, unseen environments. Specifically, the researchers aim to develop an agent that can navigate to a target location in a maze layout it has never seen before, without prior training or exploration in that environment. To make this possible, the agent is given an abstract 2D top-down map that describes the general layout of the environment, together with the start and target locations.

#### Main challenges

1. **Zero-shot navigation**: The agent must find the optimal path from the start to the target in a brand-new environment without additional training or exploration.
2. **Correspondence between 2D map and 3D environment**: The agent needs to relate the information on the 2D map to the actual 3D environment, even though the two differ in their state and action spaces.
3. **Generalization ability**: The agent needs to learn from the training maps and apply that knowledge to new, unseen maps.
4. **Robustness**: The agent should cope with problems such as inaccurate maps and localization noise.

#### Solutions

The researchers propose a model-based reinforcement learning method called the Map-conditioned Multi-task Navigator (MMN). It addresses the problem in the following ways:

- **Task-conditioned hypermodel**: Use HyperNetworks to generate task-specific transition network weights, thus adapting to different map layouts.
- **Sampling-based forward search**: Plan on the learned dynamics model with methods such as Monte Carlo Tree Search (MCTS) to find the optimal path.
- **Multi-task learning**: Improve generalization to new tasks by learning multiple tasks jointly.

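To make the task-conditioned hypermodel idea more concrete, here is a minimal sketch, assuming a single linear layer as the hypernetwork and a tiny two-layer transition network. The names `HyperModel` and `transition`, the layer sizes, and the residual update are all illustrative assumptions and are not taken from the paper; the sketch only shows how a map can be turned into the weights of another network.

```python
import numpy as np


class HyperModel:
    """Toy hypernetwork: maps a top-down map to transition-network weights.

    Hypothetical illustration only; sizes and structure are assumptions.
    """

    def __init__(self, map_size, state_dim, action_dim, hidden_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = state_dim + action_dim
        # Shapes of the transition-network parameters to be generated.
        self.shapes = [
            (in_dim, hidden_dim), (hidden_dim,),
            (hidden_dim, state_dim), (state_dim,),
        ]
        n_params = sum(int(np.prod(s)) for s in self.shapes)
        # The hypermodel's own parameters (these would be trained).
        self.W = rng.normal(0.0, 0.1, size=(map_size, n_params))
        self.b = np.zeros(n_params)

    def generate_weights(self, top_down_map):
        """Return a list of weight arrays conditioned on the given map."""
        flat = top_down_map.ravel().astype(float) @ self.W + self.b
        params, offset = [], 0
        for shape in self.shapes:
            size = int(np.prod(shape))
            params.append(flat[offset:offset + size].reshape(shape))
            offset += size
        return params


def transition(params, state, action_onehot):
    """Predict the next state using the generated weights (residual form)."""
    W1, b1, W2, b2 = params
    x = np.concatenate([state, action_onehot])
    hidden = np.tanh(x @ W1 + b1)
    return state + hidden @ W2 + b2


# Two different maps yield two different transition networks.
hyper = HyperModel(map_size=9, state_dim=2, action_dim=4)
params_a = hyper.generate_weights(np.eye(3))
params_b = hyper.generate_weights(np.ones((3, 3)))
next_state = transition(params_a, np.zeros(2), np.eye(4)[1])
```

In this sketch only `W` and `b` would be trained; the transition network has no parameters of its own, so a new map changes its behavior without any further gradient steps.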
Through these methods, the agent can quickly find a near-optimal path using the given abstract 2D map when first exposed to a new environment, without the need for further exploration or training.

#### Experimental results

Experiments show that MMN performs significantly better than model-free baseline methods (such as MAH) in long-distance navigation tasks and is more robust to inaccurate maps and localization noise.
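
The sampling-based forward search listed among the solutions can be illustrated with a much simpler stand-in than MCTS. The sketch below uses random shooting: it samples action sequences, rolls them out in a transition model, and keeps the shortest sequence that reaches the goal. The grid-based `make_model` is a hypothetical placeholder for a learned transition network, and `plan`, `ACTIONS`, and all parameters are assumptions made for illustration, not the paper's planner.

```python
import random

# Hypothetical discrete action set on a 2D grid (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def make_model(grid):
    """Stand-in for a learned transition model; 1 marks a wall, 0 free space."""
    def model(state, action):
        d_row, d_col = ACTIONS[action]
        row, col = state[0] + d_row, state[1] + d_col
        inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
        if inside and grid[row][col] == 0:
            return (row, col)
        return state  # blocked moves leave the state unchanged
    return model


def plan(model, start, goal, horizon=30, n_samples=500, seed=0):
    """Random-shooting search: return the shortest sampled path, or None."""
    if start == goal:
        return []
    rng = random.Random(seed)
    names = list(ACTIONS)
    best = None
    for _ in range(n_samples):
        state, sequence = start, []
        for _ in range(horizon):
            action = rng.choice(names)
            state = model(state, action)
            sequence.append(action)
            if state == goal:
                break
        if state == goal and (best is None or len(sequence) < len(best)):
            best = sequence
    return best


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
model = make_model(grid)
path = plan(model, start=(0, 0), goal=(2, 0))
```

Because planning only queries `model`, swapping in a transition network generated for a new map would change the plan without retraining the planner, which is the property the zero-shot setting relies on.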