Abstract: Recently, cross-spectral image patch matching based on feature relation learning has attracted extensive attention. However, performance bottleneck problems have gradually emerged in existing methods. To address this challenge, we make the first attempt to explore a stable and efficient bridge between descriptor learning and metric learning, and construct a knowledge-guided learning network (KGL-Net), which achieves amazing performance improvements while abandoning complex network structures. Specifically, we find that there is feature extraction consistency between metric learning based on feature difference learning and descriptor learning based on Euclidean distance. This provides the foundation for bridge building. To ensure the stability and efficiency of the constructed bridge, on the one hand, we conduct an in-depth exploration of 20 combined network architectures. On the other hand, a feature-guided loss is constructed to achieve mutual guidance of features. In addition, unlike existing methods, we consider that the feature mapping ability of the metric branch should receive more attention. Therefore, a hard negative sample mining for metric learning (HNSM-M) strategy is constructed. To the best of our knowledge, this is the first time that hard negative sample mining for metric networks has been implemented, and it brings significant performance gains. Extensive experimental results show that our KGL-Net achieves SOTA performance in three different cross-spectral image patch matching scenarios. Our code is available at https://github.com/YuChuang1205/KGL-Net.

What problem does this paper attempt to address?

This paper addresses the performance bottleneck in cross-spectral image patch matching. Existing methods show diminishing gains when matching patches from different spectra, where they must cope with illumination changes, geometric changes, and pixel-level nonlinear differences between the patches. Building a high-performance cross-spectral image patch matching method is therefore challenging. To address this, the authors propose a stable and efficient bridge between descriptor learning and metric learning and construct a knowledge-guided learning network (KGL-Net). KGL-Net improves performance in the following ways:

1. **Feature extraction consistency**: The authors find that there is feature extraction consistency between metric learning based on feature difference learning and descriptor learning based on Euclidean distance. This consistency provides the basis for building a bridge between the two.
2. **Exploration of multiple network architectures**: To ensure the stability and efficiency of the bridge, the authors explore 20 different combined network architectures and select the C3 architecture. In C3, the metric network adopts a pseudo-siamese structure, the descriptor network adopts a siamese structure, and the lower-level network layers share parameters to achieve more effective feature extraction.
3. **Hard negative sample mining for metric learning (HNSM-M)**: The authors propose a new hard negative sample mining strategy for metric learning. By using only positive sample pairs as input and randomly generating negative sample pairs for learning, this method improves the discriminative ability of the metric branch.
4. **Feature-guided loss**: To ensure that the hard negative sample positions obtained by the descriptor network have strict guiding significance for the metric network, the authors construct a feature-guided loss function. This loss ensures that the high-level feature maps of the two branches can guide each other's learning.

Through these innovations, KGL-Net avoids complex network structures while achieving state-of-the-art performance in multiple cross-spectral image patch matching scenarios. Experimental results show that the FPR95 of KGL-Net on the VIS-NIR dataset is 36.5% lower than that of the latest FIL-Net, and the number of parameters is reduced by 30.1%.

### Formula summary

- **Feature extraction consistency formula**:
  \[ f_m(V_p, N_p)=\phi_m(f(V), f(N)) \]
  \[ f_d(V_p, N_p)=\phi_d(f(V), f(N)) \]
- **Feature difference learning and Euclidean distance formula**:
  \[ S_{\text{out}} = m(\phi'(f_V)-\phi'(f_N)) \]
  \[ \text{dist}_{\text{out}}=\| \phi_d'(f_V')-\phi_d'(f_N') \|^2 \]
- **Feature distance matrix construction formula**:
  \[ M_{ij}=\| d_i^V - d_j^N \|^2 \]
- **Feature-guided loss formula**:
  \[ L_{fg}^v=\frac{1}{N}\sum_{i = 1}^N\| f_v' - f_v \|^2 \]
  \[ L_{fg}^n=\frac{1}{N}\sum_{i = 1}^N\| f_n' - f_n \|^2 \]
- **Total loss function formula**:
  \[ L = L_d+L_m+\alpha L_{fg}^v+\beta L_{fg}^n \]

These formulas show how KGL-Net addresses the performance bottleneck in cross-spectral image patch matching through feature extraction consistency, hard negative sample mining, and the feature-guided loss.
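The distance matrix, feature-guided loss, and total loss formulas above can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names, the toy descriptors, and the rule of picking the closest off-diagonal entry in each row as the hard negative are assumptions made for illustration, as are the example weights for \(\alpha\) and \(\beta\).

```python
def squared_distance_matrix(desc_v, desc_n):
    """M[i][j] = ||d_i^V - d_j^N||^2 for two lists of equal-length vectors."""
    return [
        [sum((a - b) ** 2 for a, b in zip(dv, dn)) for dn in desc_n]
        for dv in desc_v
    ]


def hardest_negative_indices(dist):
    """For each row i, return the column j != i with the smallest distance.

    Assumption: row i and column i come from the same matching (positive)
    pair, so every off-diagonal entry is a candidate negative.
    """
    picks = []
    for i, row in enumerate(dist):
        candidates = [(d, j) for j, d in enumerate(row) if j != i]
        picks.append(min(candidates)[1])
    return picks


def feature_guided_loss(feats_a, feats_b):
    """Mean over N samples of the squared L2 distance between paired features."""
    n = len(feats_a)
    return sum(
        sum((a - b) ** 2 for a, b in zip(fa, fb))
        for fa, fb in zip(feats_a, feats_b)
    ) / n


def total_loss(l_d, l_m, l_fg_v, l_fg_n, alpha=1.0, beta=1.0):
    """L = L_d + L_m + alpha * L_fg^v + beta * L_fg^n (weights are hypothetical)."""
    return l_d + l_m + alpha * l_fg_v + beta * l_fg_n


# Toy descriptors: three visible-spectrum and three near-infrared patches,
# where index i in both lists is assumed to be a matching pair.
desc_v = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
desc_n = [[0.0, 0.1], [1.0, 0.2], [0.9, 0.0]]

dist = squared_distance_matrix(desc_v, desc_n)
hard_negatives = hardest_negative_indices(dist)
print(hard_negatives)  # [2, 2, 0]
```

In a real training loop these operations would run on batched tensors in a deep learning framework; plain lists are used here only to keep each formula readable term by term.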

Why and How: Knowledge-Guided Learning for Cross-Spectral Image Patch Matching

Multibranch Feature Difference Learning Network for Cross-Spectral Image Patch Matching

Spnet: A Spectral Patching Network For End-To-End Hyperspectral Image Classification

Efficient Feature Relation Learning Network for Cross-Spectral Image Patch Matching

Feature Interaction Learning Network for Cross-Spectral Image Patch Matching

Learning to Match Features with Discriminative Sparse Graph Neural Network

Image Patch-Matching with Graph-Based Learning in Street Scenes

A Novel Neural Network for Remote Sensing Image Matching

Explore Better Network Framework for High-Resolution Optical and SAR Image Matching

Metric Learning for Patch-Based 3-D Image Registration.

Optical and SAR Image Matching Using Pixelwise Deep Dense Features

Joint Graph Learning and Matching for Semantic Feature Correspondence

Understanding Hyperbolic Metric Learning through Hard Negative Sampling

Improving Sparse Graph Attention for Feature Matching by Informative Keypoints Exploration.

Metric networks for enhanced perception of non-local semantic information

Rein-SLAM: Narrow the Gaps Between the Matching Task and SLAM System.

HECPG: Hyperbolic Embedding and Confident Patch-Guided Network for Point Cloud Matching

StateNet: Deep State Learning for Robust Feature Matching of Remote Sensing Images

Learning Visual Instance Retrieval from Failure: Efficient Online Local Metric Adaptation from Negative Samples.

Deep Feature Correlation Learning for Multi-Modal Remote Sensing Image Registration

Relationship Learning From Multisource Images via Spatial-Spectral Perception Network