Abstract: With the improvement in the quantity and quality of remote sensing images, content-based remote sensing object retrieval (CBRSOR) has become an increasingly important topic. However, existing CBRSOR methods neglect global statistical information during both the training and test stages, which leads neural networks to overfit to simple sample pairs during training and yields suboptimal metric performance. Inspired by the Neyman-Pearson theorem, we propose a generalized likelihood ratio test-based metric learning (GLRTML) approach, which estimates the relative difficulty of sample pairs by incorporating global data distribution information during the training and test phases. This guides the network to focus more on difficult samples during training, thereby encouraging it to learn more discriminative feature embeddings. In addition, the GLRT is a more effective metric than those of traditional metric spaces because it utilizes global data distribution information. Accurately estimating the distribution of embeddings is critical for GLRTML. However, in real-world applications, there is often a distribution shift between the training and target domains, which diminishes the effectiveness of directly using the distribution estimated on training data. To address this issue, we propose the clustering pseudo-labels-based fast parameter adaptation (CPLFPA) method. CPLFPA efficiently estimates the distribution of embeddings in the target domain by clustering target domain instances and re-estimating the distribution parameters for GLRTML. We reorganize datasets for CBRSOR tasks based on the fine-grained ship remote sensing image slices (FGSRSI-23) and military aircraft recognition (MAR20) datasets. Extensive experiments on these datasets demonstrate the effectiveness of our proposed GLRTML and CPLFPA.

What problem does this paper attempt to address?

The problem this paper attempts to solve is that existing content-based remote sensing object retrieval (CBRSOR) methods fail to use global statistical information effectively during the training and testing phases, so the model overfits to simple sample pairs and metric performance is suboptimal. Specifically:

1. **Limitations of existing methods**:
   - Existing CBRSOR methods ignore global statistical information during the training and testing phases.
   - Due to GPU memory limitations, current methods can process only a limited amount of data per batch, so the loss function is calculated from local data relationships only and easily overfits to simple sample pairs.
   - This overfitting makes it difficult for the network to learn more discriminative feature embeddings from more challenging sample pairs, which harms the generalization ability of the model.
2. **Importance of introducing global statistical information**:
   - The paper points out that global statistical information allows the relative difficulty of sample pairs to be estimated more accurately, so the network pays more attention to difficult samples during training.
   - This helps to improve the generalization ability and overall performance of the model and to avoid overfitting.
3. **The proposed new method**:
   - Inspired by the Neyman-Pearson theorem, the authors propose a metric learning method based on the generalized likelihood ratio test (GLRTML).
   - GLRTML estimates the relative difficulty of sample pairs during the training and testing phases by incorporating global data distribution information, guiding the network to learn more discriminative feature embeddings.
   - In addition, to deal with possible domain differences between the training set and the test set, the authors also propose a fast parameter adaptation method based on clustering pseudo-labels (CPLFPA), which efficiently re-estimates the distribution parameters in the target domain.

In summary, this paper aims to improve the performance of the CBRSOR task and the generalization ability of the model by introducing global statistical information and addressing the domain difference problem.

### Formula display

The formulas involved in the paper are as follows:

- Definition of likelihood ratio:
  \[ s(i, j)=\log \left(\frac{p\left(I_{i}, I_{j} \mid H_{1}\right)}{p\left(I_{i}, I_{j} \mid H_{0}\right)}\right) \]
  Calculated in the embedding space:
  \[ s(i, j)=\log \left(\frac{p\left(x_{\theta, i}, x_{\theta, j} \mid H_{1}\right)}{p\left(x_{\theta, i}, x_{\theta, j} \mid H_{0}\right)}\right) \]
- Differential embedding representation:
  \[ s(i, j)=\log \left(\frac{p\left(x_{\theta, i j} \mid H_{1}\right)}{p\left(x_{\theta, i j} \mid H_{0}\right)}\right) \]
- Likelihood ratio under the multivariate Gaussian distribution assumption:
  \[ s(i, j)=\frac{1}{2}\left(x_{\theta, i j}-\mu_{0}\right)^{T} \Sigma_{0}^{-1}\left(x_{\theta, i j}-\mu_{0}\right)-\frac{1}{2}\left(x_{\theta, i j}-\mu_{1}\right)^{T} \Sigma_{1}^{-1}\left(x_{\theta, i j}-\mu_{1}\right)+C_{0} \]
- Final simplified similarity score (MG-GLRTML):
  \[ s(i, j)=x_{\theta, i j}^{T}\left(\Sigma_{0}^{-1}-\Sigma_{1}^{-1}\right) x_{\theta, i j} \]
- Maximum likelihood estimation of the covariance matrix:
  \[ \hat{\Sigma}_{1}=\frac{1}{N_{1}} \sum_{l = 0}^{N_{1}-1}\left(x_{\theta, l}^{+}\right)\left(x_{\theta, l}^{+}\right)^{T}, \quad x_{\theta, l}^{+} \in X_{1} \]
  \[ \hat{\Sigma}_{0}=\frac{1}{N_{0}} \sum \cdots \]
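
The simplified MG-GLRTML score and the parameter re-estimation idea behind CPLFPA can be illustrated with a small NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the text above does not confirm: that H1 is the "same object" hypothesis, that the differential embedding is the plain difference `xi - xj`, that the truncated estimate of Σ0 mirrors the one for Σ1 but uses different-label pairs, and that a basic k-means is an acceptable stand-in for the clustering step. All function names and the small ridge term are hypothetical and exist only for illustration.

```python
import numpy as np


def pair_differences(emb, labels):
    """Return difference embeddings for same-label and different-label pairs."""
    i, j = np.triu_indices(len(emb), k=1)  # every unordered pair once
    diffs = emb[i] - emb[j]                # assumed form of the differential embedding
    same = labels[i] == labels[j]
    return diffs[same], diffs[~same]


def second_moment(diffs, ridge=1e-6):
    """Zero-mean covariance estimate (1/N) * sum_l d_l d_l^T.

    The ridge term is an added assumption that keeps the matrix invertible.
    """
    n, dim = diffs.shape
    return diffs.T @ diffs / n + ridge * np.eye(dim)


def glrt_matrix(emb, labels):
    """M = inv(Sigma_0) - inv(Sigma_1), estimated from labelled embeddings."""
    pos, neg = pair_differences(emb, labels)
    return np.linalg.inv(second_moment(neg)) - np.linalg.inv(second_moment(pos))


def glrt_score(m, xi, xj):
    """s(i, j) = d^T M d with d = xi - xj; larger is read as 'more likely same object'."""
    d = xi - xj
    return float(d @ m @ d)


def pseudo_labels(emb, k, iters=10):
    """Plain k-means with farthest-point initialisation (stand-in clustering step)."""
    centers = [emb[0]]
    for _ in range(k - 1):
        dist = np.min([((emb - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(emb[dist.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        assign = ((emb[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = emb[assign == c].mean(0)
    return assign


def adapt_to_target(target_emb, k):
    """Re-estimate M on unlabelled target embeddings from cluster pseudo-labels."""
    return glrt_matrix(target_emb, pseudo_labels(target_emb, k))
```

In this sketch a query would be ranked against gallery embeddings by `glrt_score`, and `adapt_to_target` would be called once on target-domain embeddings before retrieval. The number of clusters `k` is a free parameter here, and the sketch needs at least one same-label and one different-label pair to estimate both matrices.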

GLRT-Based Metric Learning for Remote Sensing Object Retrieval

Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information

Annotation Cost-Efficient Active Learning for Deep Metric Learning Driven Remote Sensing Image Retrieval

AGL-NET: Aerial-Ground Cross-Modal Global Localization with Varying Scales

Cross-Attention-Driven Adaptive Graph Relational Network for Multilabel Remote Sensing Scene Classification

Proxy-Based Rotation Invariant Deep Metric Learning for Remote Sensing Image Retrieval

A Novel Metric Learning Method Based On The Triplet Sampling Graph Convolutional Network For Remote Sensing Image Retrieval

A Lightweight Multi-Scale Crossmodal Text-Image Retrieval Method in Remote Sensing

Revisiting Local and Global Descriptor-Based Metric Network for Few-Shot SAR Target Classification

Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval

A Novel Graph-Theoretic Deep Representation Learning Method for Multi-Label Remote Sensing Image Retrieval

Class-level Prototype Guided Multi-Scale Feature Learning for Remote Sensing Scene Classification with Limited Labels

Integrating Multisubspace Joint Learning With Multilevel Guidance for Cross-Modal Retrieval of Remote Sensing Images

Robust Cross-Modal Remote Sensing Image Retrieval Via Maximal Correlation Augmentation

Meta-Graph Representation Learning for PolSAR Image Classification

Unsupervised Collaborative Metric Learning with Mixed-Scale Groups for General Object Retrieval

Learning Critical Features for Arbitrary-Oriented Object Detection in Remote-Sensing Optical Images

Domain-invariant Similarity Activation Map Metric Learning for Retrieval-based Long-term Visual Localization.

Knowledge-Aided Momentum Contrastive Learning for Remote-Sensing Image Text Retrieval

Randomized Spectrum Transformations for Adapting Object Detector in Unseen Domains

Toward Multiparty Personalized Collaborative Learning in Remote Sensing