Abstract: Existing learning-based point feature descriptors are usually task-agnostic: they aim to describe individual 3D point clouds as accurately as possible. The matching task, however, requires describing corresponding points consistently across different 3D point clouds. Overly accurate features can therefore be counterproductive, because unpredictable noise, partiality, deformation, etc. in the local geometry make the feature representations of corresponding points inconsistent. In this paper, we propose to learn a robust task-specific feature descriptor that consistently describes correct point correspondences under interference. Built from an Encoder and a Dynamic Fusion module, our method, EDFNet, develops from two aspects. First, we augment the matchability of correspondences by utilizing their repetitive local structure. To this end, a special encoder is designed to exploit the two input point clouds jointly for each point descriptor. It not only captures the local geometry of each point in the current point cloud by convolution, but also exploits the repetitive structure from the paired point cloud with a Transformer. Second, we propose a dynamic fusion module to jointly use features of different scales. A single-scale feature faces an inevitable trade-off between robustness and discriminativeness. Specifically, a small-scale feature is robust, since little interference exists in its small receptive field, but it is not sufficiently discriminative, as a point cloud contains many repetitive local structures; the resulting descriptors lead to many incorrect matches. In contrast, a large-scale feature is more discriminative because it integrates more neighborhood information, but it is more easily disturbed, since its large receptive field contains much more interference. Compared with the conventional fusion strategy that handles multiple scale features equally, we analyze their consistency to identify the clean ones and assign them larger aggregation weights during fusion. A robust and discriminative feature descriptor is then obtained by focusing on the clean scale features. Extensive evaluations validate that EDFNet learns a task-specific descriptor that achieves state-of-the-art or comparable performance for robust matching of 3D point clouds.
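
The idea of weighting scales by their mutual consistency can be illustrated with a small sketch. This is not the EDFNet implementation: the abstract does not specify how consistency is measured, so the mean cosine similarity to the other scales, the softmax weighting, the `temperature` parameter, and the function names below are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def consistency_weights(scale_feats, temperature=0.1):
    """Assumed scoring rule: a scale is 'clean' if it agrees with the others.

    Each scale's score is its mean cosine similarity to all other scales;
    a softmax over the scores gives the aggregation weights.
    """
    n = len(scale_feats)
    if n < 2:
        raise ValueError("need at least two scales to measure consistency")
    scores = []
    for i in range(n):
        sims = [cosine(scale_feats[i], scale_feats[j]) for j in range(n) if j != i]
        scores.append(sum(sims) / len(sims))
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(scale_feats, temperature=0.1):
    """Weighted sum of per-scale features for one point."""
    weights = consistency_weights(scale_feats, temperature)
    dim = len(scale_feats[0])
    fused = [sum(w * f[d] for w, f in zip(weights, scale_feats)) for d in range(dim)]
    return fused, weights


# Toy per-point features at three scales (hypothetical values).
feats = [
    [1.0, 0.0, 0.0],   # small scale
    [0.9, 0.1, 0.0],   # medium scale, agrees with the small scale
    [-0.2, 1.0, 0.3],  # large scale, disturbed by interference
]
fused, weights = fuse(feats)
```

In this toy input the third scale disagrees with the other two, so it receives the smallest weight and contributes little to the fused descriptor, whereas equal-weight fusion would let it shift the result by a third.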

Learning Enriched Feature Descriptor for Image Matching and Visual Measurement

Deep learning feature representation for image matching under large viewpoint and viewing direction change

A Light-weight Transformer-based Self-supervised Matching Network for Heterogeneous Images

Learning Local Event-based Descriptor for Patch-based Stereo Matching

Learning Hierarchical Visual Transformation for Domain Generalizable Visual Matching and Recognition

Hybrid Histogram Descriptor: A Fusion Feature Representation For Image Retrieval

HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

D2Former: Jointly Learning Hierarchical Detectors and Contextual Descriptors Via Agent-based Transformers

Robust Local Feature Descriptor for Multisource Remote Sensing Image Registration

Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

Remote Sensing Image Matching Based on Adaptive Binning SIFT Descriptor

Learning a Task-Specific Descriptor for Robust Matching of 3D Point Clouds

Feature detection and description for image matching: from hand-crafted design to deep learning

OD-Net: Orthogonal descriptor network for multiview image keypoint matching

Image Matching and Localization Based on Fusion of Handcrafted and Deep Features

Learning Geometric Feature Embedding with Transformers for Image Matching

Descriptor Ensemble: An Unsupervised Approach to Descriptor Fusion in the Homography Space

HD2Reg: Hierarchical Descriptors and Detectors for Point Cloud Registration

Deep Descriptor Transforming for Image Co-Localization

Robust Feature Matching via Hierarchical Local Structure Visualization

LMFD: lightweight multi-feature descriptors for image stitching