Scale Adaptive Skip-ASP Pixel Voting Network for Monocular 6D Object Pose Estimation

Li Wu, Xin Ma
DOI: https://doi.org/10.23919/CCC58697.2023.10240199
2023-01-01
Abstract: Estimating the 6D pose of an object from a single RGB image is a challenging task in computer vision. Although keypoint-based methods have recently shown promising results, they still perform poorly on objects that appear small in the image. To address this problem, we perform an in-depth pose error analysis of keypoint-based methods and propose a Scale Adaptive Skip-ASP Pixel Voting Network (SASA-PVNet), which adaptively scales small objects in the input image to a suitable size and scales the predicted 2D keypoints back to the original input image during PnP. We also design a Skip-ASP module between the encoder and decoder that uses multi-scale information fusion to reduce the distortion and blur caused by scaling the input image. Experiments demonstrate that our method outperforms the state of the art on the LINEMOD dataset by a large margin and obtains competitive results on the Occlusion LINEMOD dataset.
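
The scale-and-restore step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the small object is cropped around a known 2D bounding box and resized uniformly so that its longest side matches a target size. The function names, the bounding-box input, and the `target_size` parameter are all hypothetical. The sketch only shows how keypoints predicted in the scaled crop could be mapped back to original image coordinates before being passed to a PnP solver.

```python
import numpy as np


def crop_and_scale_params(bbox, target_size):
    """Return (scale, offset) for a crop around bbox.

    Assumption: bbox is (x_min, y_min, x_max, y_max) in original pixels,
    and the crop is resized uniformly so its longest side equals target_size.
    """
    x_min, y_min, x_max, y_max = bbox
    longest_side = max(x_max - x_min, y_max - y_min)
    scale = target_size / longest_side
    offset = np.array([x_min, y_min], dtype=float)
    return scale, offset


def keypoints_to_scaled(keypoints_original, scale, offset):
    """Map keypoints from original image coordinates into the scaled crop."""
    return (np.asarray(keypoints_original, dtype=float) - offset) * scale


def keypoints_to_original(keypoints_scaled, scale, offset):
    """Map keypoints predicted in the scaled crop back to the original image."""
    return np.asarray(keypoints_scaled, dtype=float) / scale + offset


# Hypothetical example: a 40x30 pixel object enlarged so its longest side is 160.
scale, offset = crop_and_scale_params((100, 50, 140, 80), target_size=160)
predicted_in_crop = np.array([[80.0, 60.0], [0.0, 0.0]])
restored = keypoints_to_original(predicted_in_crop, scale, offset)
```

The restored 2D keypoints, together with the corresponding 3D model keypoints and the camera intrinsics of the original image, would then be the inputs to PnP. Mapping the keypoints back, rather than adjusting the intrinsics to the crop, keeps the pose expressed in the original camera frame.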