Phase Discriminated Multi-Policy for Visual Room Rearrangement.

Beibei Wang, Xiaohan Wang, Xinhang Song, Yuehu Liu
DOI: https://doi.org/10.1109/smc53992.2023.10394019
2023-01-01
Abstract: Embodied AI, where an agent learns to accomplish tasks through interaction with its surrounding environment, is drawing increasing attention in the community. As a challenging Embodied AI task, visual room rearrangement aims to restore the initially misplaced objects in a room to their target state. Existing approaches usually use a single policy to learn a mapping from visual observation to action. Such methods may be capable of accomplishing tasks with simple goals, such as visual navigation. However, the agent in the rearrangement task has to explore various types of interaction over a long horizon, and relying on a single policy can easily leave it stuck in a local optimum. In this paper, we propose a Phase Discriminated Multi-Policy (PDMP) model, which decomposes the task into specific phases and tackles them with customized policies. In particular, we first introduce a graph representation of object relationships that provides scene layout knowledge, from which the current task phase is discriminated. Then, based on this knowledge, a hierarchical actor-critic module dynamically calls the policies capable of navigation or object interaction. Each policy is trained with a narrowed action space and dense rewards, so that the policies converge better and cooperate to reach long-term goals. Comprehensive experiments on the AI2-THOR platform show that the proposed model achieves better performance than the baselines.
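The phase-then-policy structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the action names, the `misplaced_object_in_reach` flag (standing in for the graph-based scene knowledge), the rule-based `discriminate_phase` function, and the random action choice (standing in for a learned actor-critic policy) are all assumptions made for illustration only.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical action names; the abstract does not list the real action space.
NAV_ACTIONS = ["move_ahead", "rotate_left", "rotate_right"]
INTERACT_ACTIONS = ["pick_up", "put_down", "open"]


@dataclass
class SubPolicy:
    """A policy restricted to a narrowed action space."""
    name: str
    actions: List[str]

    def act(self, observation: dict, rng: random.Random) -> str:
        # Placeholder for a learned policy: sample uniformly from the
        # narrowed action set. A trained model would use the observation.
        return rng.choice(self.actions)


def discriminate_phase(observation: dict) -> str:
    # Assumed stand-in for the paper's graph-based phase discrimination:
    # a single boolean flag decides between the two phases.
    if observation.get("misplaced_object_in_reach", False):
        return "interact"
    return "navigate"


POLICIES: Dict[str, SubPolicy] = {
    "navigate": SubPolicy("navigate", NAV_ACTIONS),
    "interact": SubPolicy("interact", INTERACT_ACTIONS),
}


def select_action(
    observation: dict,
    policies: Dict[str, SubPolicy],
    rng: random.Random,
) -> Tuple[str, str]:
    """Pick the phase first, then delegate to that phase's policy."""
    phase = discriminate_phase(observation)
    action = policies[phase].act(observation, rng)
    return phase, action


if __name__ == "__main__":
    rng = random.Random(0)
    print(select_action({"misplaced_object_in_reach": False}, POLICIES, rng))
    print(select_action({"misplaced_object_in_reach": True}, POLICIES, rng))
```

The point of the sketch is the dispatch pattern: because each sub-policy only ever chooses from its own small action set, the high-level selector carries the burden of deciding when to switch between navigation and interaction.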