HomeRobot Open Vocabulary Mobile Manipulation Challenge 2023 Participant Report (Team KuzHum)

Volodymyr Kuzma, Vladyslav Humennyy, Ruslan Partsey
2024-01-22
Abstract: We report improvements to the NeurIPS 2023 HomeRobot: Open Vocabulary Mobile Manipulation (OVMM) Challenge reinforcement learning baseline. More specifically, we propose a more accurate semantic segmentation module, along with a better place skill policy and a high-level heuristic, that outperforms the baseline by 2.4 percentage points in overall success rate (a sevenfold improvement) and 8.2 percentage points in partial success rate (a 1.75-fold improvement) on the Test Standard split of the challenge dataset. With the aforementioned enhancements incorporated, our agent scored 3rd place in the challenge on both the simulation and real-world stages.
Robotics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses open-vocabulary mobile manipulation (OVMM) in a home environment: the robot must navigate an unknown environment, find a specified object, and move it to a specified target container. The work improves the reinforcement learning baseline and raises both the overall and the partial success rate by enhancing the semantic segmentation module, improving the place skill policy, and introducing a high-level heuristic.

### Main Contributions

1. **Improvement of the Semantic Segmentation Module**:
   - A retrained YOLOv8 object detection model and the MobileSAM segmentation model are used to improve the agent's understanding of the environment.
   - Combined with the Detic perception module, they produce more accurate semantic segmentation masks, especially for small objects and furniture types.
2. **Optimization of the Place Skill Policy**:
   - The place skill of the baseline is analyzed to identify its bottlenecks.
   - The reward function is adjusted to better guide the agent when placing objects, reducing unstable placements and cases where the object misses the target container.
3. **Introduction of a High-Level Heuristic**:
   - A more elaborate high-level policy is designed; conditional loops ensure that subsequent stages do not start before the object has been grasped successfully.
   - The success rates of the navigation and placement stages improve, which is reflected mainly in the partial success rate.

### Experimental Results

- On the Test Standard split, the improved agent achieves a 7-fold increase in overall success rate (from 0.4% to 2.8%) and a 1.75-fold increase in partial success rate (from 10.9% to 19.1%).
- The agent placed third in both the simulation and the real-world stages of the challenge.
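The conditional loop described under the high-level heuristic can be illustrated with a minimal sketch. This is not the team's implementation: the skill names, the `execute` callback, and the `max_pick_attempts` limit are all hypothetical, chosen only to show how later stages can be gated on a successful grasp.

```python
from enum import Enum, auto
from typing import Callable, List


class Skill(Enum):
    # Hypothetical skill names; the actual agent's skills may differ.
    FIND_OBJECT = auto()
    PICK = auto()
    FIND_RECEPTACLE = auto()
    PLACE = auto()


def run_episode(
    execute: Callable[[Skill], bool],
    max_pick_attempts: int = 3,  # assumed retry budget, not from the paper
) -> List[Skill]:
    """Run skills in order, looping on find + pick until the grasp succeeds.

    `execute` is a hypothetical callback that runs one skill and returns
    True on success. The function returns the skills that were attempted.
    """
    attempted: List[Skill] = []

    grasped = False
    for _ in range(max_pick_attempts):
        attempted.append(Skill.FIND_OBJECT)
        if not execute(Skill.FIND_OBJECT):
            continue  # object not found, start the next attempt
        attempted.append(Skill.PICK)
        if execute(Skill.PICK):
            grasped = True
            break

    if not grasped:
        # Never move on to the receptacle without the object in hand.
        return attempted

    for skill in (Skill.FIND_RECEPTACLE, Skill.PLACE):
        attempted.append(skill)
        if not execute(skill):
            break
    return attempted
```

With a callback whose first pick fails and second pick succeeds, the trace contains two find + pick rounds followed by the receptacle search and the placement; if every pick fails, the trace never reaches the later stages.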
### Future Work

- **Object Tracker**: An object tracker would prevent objects from disappearing between consecutive frames and make the agent more stable during navigation and manipulation.
- **Improvement of Policies and Skills**: Further training of the high-level policy and of the individual skills should improve the agent's overall performance.
- **World Representation**: Building a representation of the environment that stores known objects and paths would streamline exploration and reduce unnecessary repeated exploration.

### Summary

Although significant improvements have been made, the current method does not yet fully solve the OVMM task. Future work needs to continue on semantic segmentation, object tracking, policy optimization, and related areas to further improve the robot's performance.
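To make the object-tracker item under Future Work more concrete, the sketch below shows one very simple way to keep a detection alive when the detector misses it for a few frames. It is a stand-in under stated assumptions, not a full tracker and not something the report describes: the class name `DetectionMemory`, the per-category bounding-box format, and the `max_missed_frames` threshold are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


class DetectionMemory:
    """Keep the last seen box per category alive for a few missed frames."""

    def __init__(self, max_missed_frames: int = 5) -> None:
        self.max_missed_frames = max_missed_frames
        # category -> (last box, number of consecutive frames without it)
        self._tracks: Dict[str, Tuple[Box, int]] = {}

    def update(self, detections: Dict[str, Box]) -> Dict[str, Box]:
        """Merge this frame's detections with remembered ones."""
        refreshed: Dict[str, Tuple[Box, int]] = {}

        # Carry over categories missing from this frame, within the budget.
        for category, (box, missed) in self._tracks.items():
            if category not in detections and missed + 1 <= self.max_missed_frames:
                refreshed[category] = (box, missed + 1)

        # Fresh detections always win and reset the missed counter.
        for category, box in detections.items():
            refreshed[category] = (box, 0)

        self._tracks = refreshed
        return {category: box for category, (box, _) in refreshed.items()}
```

Calling `update` once per frame returns the boxes the agent should treat as visible; a category is dropped only after it has been missing for more than `max_missed_frames` consecutive frames.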