Abstract: Robust local cross-domain feature descriptors of 2D images and 3D point clouds play an important role in 2D and 3D vision applications such as augmented reality (AR) and robot navigation. In essence, such descriptors can establish a spatial relationship between 2D space and 3D space. However, it is challenging for handcrafted or traditional deep learning-based methods to learn descriptors that remain invariant across 2D images and 3D point clouds. In particular, mainstream point cloud deep learning networks are designed to extract the global structure information of a scene. Because of the difference in dimensionality, there is a large gap between the feature representations of 2D images and 3D structures. In this paper, based on a dataset of 2D image patches and 3D point cloud volumes, a novel network, 2D3D-MVPNet, is proposed to jointly learn robust local cross-domain feature descriptors between 2D images and 3D point clouds. 2D3D-MVPNet contains a point cloud branch and an image branch, which are optimized with a triplet loss and a second-order similarity regularization. For the point cloud branch, first, a novel point cloud feature descriptor extractor, named the image-based point cloud encoder, is introduced to learn a local 3D feature descriptor consistent with the local 2D feature descriptor, so that the local 3D feature descriptors contain both geometry and colour texture information. Second, because the projected images of a point cloud arrive in no fixed order, a symmetric function is introduced to combine the features of the point cloud projections. Experiments show that the local cross-domain feature descriptors of 2D images and 3D point clouds learned by 2D3D-MVPNet achieve excellent 2D-to-3D retrieval performance. In addition, several 3D point cloud registration results demonstrate the effectiveness of the image-based point cloud encoder.
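
The two training ingredients named in the abstract, a symmetric function over projection features and a triplet loss with second-order similarity regularization, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes element-wise max pooling as the symmetric function (the abstract does not say which one is used), a hinge triplet loss with an arbitrary margin of 1.0, and one common form of second-order similarity that compares intra-batch distances among anchors with those among their positives. All function names are hypothetical.

```python
import math
import random


def max_pool_views(view_features):
    """Element-wise max over per-view feature vectors.

    Max is a symmetric function, so the result does not depend on
    the order in which the projected views are supplied (assumed choice).
    """
    return [max(column) for column in zip(*view_features)]


def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: the matching pair should be closer than the
    non-matching pair by at least `margin` (margin value is an assumption)."""
    return max(0.0, l2_distance(anchor, positive)
               - l2_distance(anchor, negative) + margin)


def second_order_similarity(anchors, positives):
    """Mean over i of sqrt(sum_{j != i} (d(a_i, a_j) - d(p_i, p_j))^2).

    Penalizes batches where distances among anchors (e.g. image descriptors)
    differ from distances among their matching positives (e.g. point cloud
    descriptors). Assumed form, not taken from the source.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        squared = 0.0
        for j in range(n):
            if j != i:
                diff = (l2_distance(anchors[i], anchors[j])
                        - l2_distance(positives[i], positives[j]))
                squared += diff * diff
        total += math.sqrt(squared)
    return total / n


# Three hypothetical per-view feature vectors for one point cloud volume.
views = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.4], [0.5, 0.5, 0.8]]
shuffled = views[:]
random.Random(0).shuffle(shuffled)

pooled = max_pool_views(views)
pooled_shuffled = max_pool_views(shuffled)  # same result in any view order
```

In a full training setup the total objective would typically be the triplet term plus a weighted regularization term, computed on descriptors produced by the image branch and the point cloud branch. The weighting is not given in the abstract.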

2D3D-MatchNet: Learning to Match Keypoints Across 2D Image and 3D Point Cloud

P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching

KdO-Net: Towards Improving the Efficiency of Deep Convolutional Neural Networks Applied in the 3D Pairwise Point Feature Matching

D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features

2D3D-MATR: 2D-3D Matching Transformer for Detection-free Registration between Images and Point Clouds

2D3D-MVPNet: Learning cross-domain feature descriptors for 2D-3D matching based on multi-view projections of point clouds

Unconstrained Matching of 2D and 3D Descriptors for 6-DOF Pose Estimation

The Perfect Match: 3D Point Cloud Matching with Smoothed Densities

2D-3DMatchingNet: Multimodal Point Completion with 2D Geometry Matching

DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration

DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localization

Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem

KeyMatchNet: Zero-Shot Pose Estimation in 3D Point Clouds by Generalized Keypoint Matching

LodoNet: A Deep Neural Network with 2D Keypoint Matching for 3D LiDAR Odometry Estimation

Metric Learning for 2D Image Patch and 3D Point Cloud Volume Matching

Matching 2D Image Patches and 3D Point Cloud Volumes by Learning Local Cross-domain Feature Descriptors

3Dpcp-Net: A Lightweight Progressive 3D Correspondence Pruning Network for Accurate and Efficient Point Cloud Registration

Two Heads Are Better than One: Image-Point Cloud Network for Depth-Based 3D Hand Pose Estimation

MatchNorm: Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World

P^3-Net: Part Mobility Parsing from Point Cloud Sequences Via Learning Explicit Point Correspondence

MVPointNet: Multi-View Network for 3D Object Based on Point Cloud