2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud

Mengdan Feng, Sixing Hu, Marcelo Ang, Gim Hee Lee
DOI: https://doi.org/10.48550/arXiv.1904.09742
2019-04-22
Abstract: Large-scale point clouds generated from 3D sensors are more accurate than their image-based counterparts. However, they are seldom used in visual pose estimation because 2D-3D correspondences between images and point clouds are difficult to obtain. In this paper, we propose 2D3D-MatchNet, an end-to-end deep network architecture that jointly learns descriptors for 2D keypoints from images and 3D keypoints from point clouds. As a result, we can directly match keypoints and establish 2D-3D correspondences between a query image and a 3D point cloud reference map for visual pose estimation. We create our Oxford 2D-3D Patches dataset from the Oxford RobotCar dataset, with ground-truth camera poses and 2D-3D image-to-point-cloud correspondences, for training and testing the deep network. Experimental results verify the feasibility of our approach.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?