Abstract: We present an online planning framework for solving multi-object rearrangement problems in partially observable, multi-room environments. Current object rearrangement solutions, primarily based on Reinforcement Learning or hand-coded planning methods, often lack adaptability to diverse challenges. To address this limitation, we introduce a novel Hierarchical Object-Oriented Partially Observed Markov Decision Process (HOO-POMDP) planning approach. This approach comprises (a) an object-oriented POMDP planner generating sub-goals, (b) a set of low-level policies for sub-goal achievement, and (c) an abstraction system converting the continuous low-level world into a representation suitable for abstract planning. We evaluate our system on varying numbers of objects, rooms, and problem types in AI2-THOR simulated environments with promising results.

What problem does this paper attempt to address?

This paper addresses the multi-object rearrangement problem, in particular complex tasks in partially observable, multi-room environments. It argues that current object rearrangement solutions, mainly based on reinforcement learning or hand-coded planning, adapt poorly to diverse challenges. To address this limitation, the authors introduce a Hierarchical Object-Oriented Partially Observable Markov Decision Process (HOO-POMDP) planning method.

### Problem Background

In real home environments, multi-object rearrangement is a fundamental challenge that combines perception, planning, navigation, and manipulation. In multi-room settings the problem becomes harder because most of the environment is out of view at any given time. Such scenarios are common in daily life, for example tidying up a home or putting away groceries, and are therefore important for the development of next-generation home-assistive robots.

### Limitations of Existing Methods

Existing multi-object rearrangement methods fall mainly into two categories:

1. **Reinforcement Learning (RL) Methods**: As problems become more complex and longer-horizon, RL methods often struggle to scale.
2. **Hand-Coded Planning Systems**: These methods usually fix the order of skill application in advance or use greedy planners, which limits their ability to find the best interaction sequence and to handle new problems such as blocked paths or occluded target locations.

### The Method Proposed in the Paper

To address these problems, the authors propose the HOO-POMDP planning framework, which consists of the following components:

- **Object-Oriented POMDP Planner**: Generates sub-goals.
- **Set of Low-Level Policies**: Used to achieve sub-goals.
- **Abstraction System**: Transforms the continuous low-level world into a representation suitable for abstract planning.

### Main Contributions

The main contributions of the paper are:

1. **Modular Planning System**: Comprises an object-oriented planner and a state-abstraction module, suited to object rearrangement in multi-room environments.
2. **New Dataset**: Contains blocked-path problems and extended room configurations, as well as existing rearrangement challenges.
3. **Empirical Evaluation**: Evaluates the system under different conditions in the AI2-THOR simulation environment.

With this approach, the system can handle complex rearrangement tasks in partially observable, multi-room environments and can adapt to new problems such as blocked paths or occluded target locations.

### Formula Representation

The main formulas in the paper are as follows.

POMDP definition:

\[ \text{POMDP} = (S, A, T, R, \gamma, O, O_{\text{model}}) \]

- State space \(S\)
- Action space \(A\)
- Transition function \(T(s, a, s') = p(s' \mid s, a)\)
- Reward function \(R(s, a)\)
- Discount factor \(\gamma\)
- Observation space \(O\)
- Observation model \(O_{\text{model}}(s', a, z) = p(z \mid s', a)\), the probability of observing \(z\) in the state \(s'\) reached after taking action \(a\)

Belief update, where \(\eta\) is a normalizing constant that makes \(b'\) sum to one:

\[ b'(s') = \eta \, O_{\text{model}}(s', a, z) \sum_{s \in S} T(s, a, s') \, b(s) \]

Through these components, the HOO-POMDP framework is designed to handle multi-object rearrangement tasks in partially observable, multi-room environments.
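To make the belief update concrete, here is a minimal Python sketch of the exact update over a small discrete state space. It is not the paper's implementation: the function name `belief_update`, the two-room toy model, and the probabilities 0.8 and 0.1 are all assumptions chosen for illustration. Exact enumeration like this is only practical when the state space is small.

```python
from typing import Callable, Dict, Hashable, Iterable

State = Hashable


def belief_update(
    belief: Dict[State, float],
    action: str,
    observation: str,
    states: Iterable[State],
    transition: Callable[[State, str, State], float],
    obs_model: Callable[[State, str, str], float],
) -> Dict[State, float]:
    """Exact discrete Bayes filter: b'(s') = eta * O(s', a, z) * sum_s T(s, a, s') b(s)."""
    states = list(states)
    unnormalized = {}
    for s_next in states:
        # Prediction step: probability of landing in s_next under the prior belief.
        predicted = sum(
            transition(s, action, s_next) * belief.get(s, 0.0) for s in states
        )
        # Correction step: weight by the likelihood of the observation in s_next.
        unnormalized[s_next] = obs_model(s_next, action, observation) * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("Observation has zero probability under the current belief.")
    # Dividing by the total is the normalizing constant eta in the formula.
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy model: the state is the room that contains a single object.
ROOMS = ["kitchen", "bedroom"]


def toy_transition(s: State, a: str, s_next: State) -> float:
    # Assumption: looking around does not move the object.
    return 1.0 if s == s_next else 0.0


def toy_obs_model(s_next: State, a: str, z: str) -> float:
    # Assumption: the detector fires with prob. 0.8 if the object is in the
    # room being inspected, and with prob. 0.1 (false positive) otherwise.
    looked_room = a.removeprefix("look_")
    p_found = 0.8 if s_next == looked_room else 0.1
    return p_found if z == "found" else 1.0 - p_found


prior = {"kitchen": 0.5, "bedroom": 0.5}
posterior = belief_update(
    prior, "look_kitchen", "not_found", ROOMS, toy_transition, toy_obs_model
)
print(posterior)
```

In this toy case, failing to find the object in the kitchen shifts probability mass toward the bedroom, which is the kind of evidence a planner can use when choosing which room to search next.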

Hierarchical Object-Oriented POMDP Planning for Object Rearrangement

Hierarchical POMDP Planning for Object Manipulation in Clutter

Efficient Planning for Object Search Task Based on Hierarchical POMDP

Task Planning for Object Rearrangement in Multi-room Environments

A POMDP-based hierarchical planning framework for manipulation under pose uncertainty

Efficient Object Manipulation to an Arbitrary Goal Pose: Learning-based Anytime Prioritized Planning

Knowledge-Based Hierarchical POMDPs for Task Planning

Multilevel Monte-Carlo for Solving POMDPs Online

Robotic manipulation of multiple objects as a POMDP

Constrained Hierarchical Monte Carlo Belief-State Planning

MANER: Multi-Agent Neural Rearrangement Planning of Objects in Cluttered Environments

A Framework for Sequential Planning in Multi-Agent Settings

Scalable Planning and Learning for Multiagent POMDPs: Extended Version

Online Planning in POMDPs with State-Requests

Adaptive Online Packing-guided Search for POMDPs

Scaling Long-Horizon Online POMDP Planning via Rapid State Space Sampling

Object-Centric Kinodynamic Planning for Nonprehensile Robot Rearrangement Manipulation

PODDP: Partially Observable Differential Dynamic Programming for Latent Belief Space Planning

A Partially Observable Monte Carlo Planning Algorithm Based on Path Modification.

Hierarchical Policy for Non-prehensile Multi-object Rearrangement with Deep Reinforcement Learning and Monte Carlo Tree Search

Hybrid Heuristic Online Planning for POMDPs