A Method for Unseen Object Six Degrees of Freedom Pose Estimation Based on Segment Anything Model and Hybrid Distance Optimization

Li Xin, Hu Lin, Xinjun Liu, Shiyu Wang
DOI: https://doi.org/10.3390/electronics13040774
IF: 2.9
2024-02-17
Electronics
Abstract: Six degrees of freedom pose estimation technology constitutes the cornerstone for precise robotic control and similar tasks. Addressing the limitations of current 6-DoF pose estimation methods in handling object occlusions and unknown objects, we have developed a novel two-stage 6-DoF pose estimation method that integrates RGB-D data with CAD models. Initially, targeting high-quality zero-shot object instance segmentation tasks, we innovated the CAE-SAM model based on the SAM framework. In addressing the SAM model's boundary blur, mask voids, and over-segmentation issues, this paper introduces innovative strategies such as local spatial-feature-enhancement modules, global context markers, and a bounding box generator. Subsequently, we proposed a registration method optimized through a hybrid distance metric to diminish the dependency of point cloud registration algorithms on sensitive hyperparameters. Experimental results on the HQSeg-44K dataset substantiate the notable improvements in instance segmentation accuracy and robustness rendered by the CAE-SAM model. Moreover, the efficacy of this two-stage method is further corroborated using a 6-DoF pose dataset of workpieces constructed with CloudCompare and RealSense. For unseen targets, the ADD metric achieved 2.973 mm, and the ADD-S metric reached 1.472 mm. This paper significantly enhances pose estimation performance and streamlines the algorithm's deployment and maintenance procedures.
engineering, electrical & electronic; computer science, information systems; physics, applied
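The ADD and ADD-S figures quoted in the abstract are standard pose-error metrics: ADD averages the distance between corresponding model points under the ground-truth and estimated poses, while ADD-S averages the distance from each ground-truth point to its nearest estimated point, which makes it tolerant of symmetric objects. The following is a minimal sketch of those standard definitions, not the paper's evaluation code. The function names and the brute-force nearest-neighbour search are illustrative assumptions, and results come out in whatever unit the model points use (millimetres would match the abstract).

```python
import numpy as np


def transform(points, R, t):
    """Apply a rigid transform (3x3 rotation R, 3-vector t) to (N, 3) points."""
    return points @ R.T + t


def add_metric(model_points, R_gt, t_gt, R_est, t_est):
    """ADD: mean distance between corresponding points under the two poses."""
    gt = transform(model_points, R_gt, t_gt)
    est = transform(model_points, R_est, t_est)
    return float(np.linalg.norm(gt - est, axis=1).mean())


def add_s_metric(model_points, R_gt, t_gt, R_est, t_est):
    """ADD-S: mean distance from each ground-truth point to its closest
    estimated point. Brute force (N x N distances); assumed adequate for a
    few thousand sampled model points."""
    gt = transform(model_points, R_gt, t_gt)
    est = transform(model_points, R_est, t_est)
    pairwise = np.linalg.norm(gt[:, None, :] - est[None, :, :], axis=2)
    return float(pairwise.min(axis=1).mean())
```

For an object with a four-fold rotational symmetry, an estimate that is off by a quarter turn gives a large ADD but an ADD-S close to zero, which is why both values are usually reported together.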
What problem does this paper attempt to address?
The paper addresses six degrees of freedom (6-DoF) pose estimation, especially in scenes involving object occlusion and unknown objects. The research team developed a two-stage 6-DoF pose estimation method that combines RGB-D data with computer-aided design (CAD) models. The key issues addressed are as follows:

1. **Zero-shot instance segmentation**: To improve instance segmentation quality, particularly for unknown objects, the paper proposes a Context-Aware Enhanced SAM (CAE-SAM) model built on the Segment Anything Model (SAM) framework. A local spatial feature enhancement module, global context markers, and a bounding box generator are introduced to counter SAM's boundary blurring, mask holes, and over-segmentation.
2. **Point cloud registration optimization**: To reduce the dependency of point cloud registration algorithms on sensitive hyperparameters, the paper proposes a registration method optimized with a hybrid distance metric. The method also reduces reliance on high-quality, large-scale training sets.
3. **6-DoF pose estimation**: The goal is high-precision pose estimation in complex scenes, particularly for stacked and unknown objects. The researchers run zero-shot instance segmentation on the RGB image, extract each object's point cloud from the depth map using the resulting mask, and then geometrically register that point cloud with the CAD model to estimate the object's 6-DoF pose.

For unseen targets in the authors' workpiece dataset, the method reaches an ADD of 2.973 mm and an ADD-S of 1.472 mm. It also simplifies the deployment and maintenance of the algorithm, which matters for robotic control and automated systems.
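The step between segmentation and registration, extracting an object's point cloud from the depth map, can be sketched as a pinhole back-projection of the masked depth pixels. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name, the intrinsics `fx`, `fy`, `cx`, `cy`, and the default `depth_scale` of 0.001 (raw depth in millimetres converted to metres) are all hypothetical and would come from the actual camera calibration.

```python
import numpy as np


def masked_depth_to_points(depth, mask, fx, fy, cx, cy, depth_scale=0.001):
    """Back-project the depth pixels selected by an instance mask.

    depth: (H, W) raw depth image; mask: (H, W) boolean instance mask.
    Returns an (N, 3) array of camera-frame points, one per valid pixel.
    """
    # Keep only masked pixels that have a depth reading (0 means "no data").
    valid = mask.astype(bool) & (depth > 0)
    v, u = np.nonzero(valid)  # v = row index, u = column index

    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
    z = depth[v, u].astype(np.float64) * depth_scale
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

The resulting array is the kind of input a registration stage would align against points sampled from the CAD model. The hybrid distance metric itself is not specified in this summary, so it is not sketched here.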