Abstract: With the development of convolutional neural networks (CNNs), the semantic understanding of remote sensing (RS) scenes has improved significantly, owing to the strong feature encoding capabilities of these networks. While many existing deep-learning models focus on designing different architectures, only a few works in the RS field have investigated the performance of the learned feature embeddings and the associated metric space. In particular, two main loss functions have been exploited: the contrastive loss and the triplet loss. However, the straightforward application of these techniques to RS images may not optimally capture their neighborhood structures in the metric space, due to the insufficient sampling of image pairs or triplets during the training stage and to the inherent semantic complexity of remotely sensed data. To solve these problems, we propose a new deep metric learning approach that overcomes the limitation on class discrimination by means of two components: 1) scalable neighborhood component analysis (SNCA), which aims at discovering the neighborhood structure in the metric space, and 2) the cross-entropy loss, which aims at preserving the class discrimination capability based on the learned class prototypes. Moreover, in order to preserve feature consistency among all the minibatches during training, a novel optimization mechanism based on momentum update is introduced for minimizing the proposed loss. An extensive experimental comparison (using several state-of-the-art models and two different benchmark data sets) has been conducted to validate the effectiveness of the proposed method from different perspectives, including: 1) classification; 2) clustering; and 3) image retrieval. The code for this article will be made publicly available for reproducible research by the community.
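As a rough illustration of the two loss components and the momentum update described in the abstract, the following NumPy sketch may help. It is not the authors' implementation: the function names, the temperature value, the weighting factor `lam`, the momentum coefficient, and the use of a per-sample memory bank of normalized embeddings are all assumptions made for this example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def snca_loss(feats, idx, labels_all, memory, temperature=0.1):
    """Neighborhood term (assumed form): negative log of the probability
    mass that each sample assigns to same-class neighbors in the memory bank.
    feats: (B, D) normalized embeddings, idx: (B,) dataset indices."""
    sims = feats @ memory.T / temperature            # (B, N) scaled similarities
    sims[np.arange(len(idx)), idx] = -np.inf         # a sample is not its own neighbor
    sims = sims - sims.max(axis=1, keepdims=True)    # numerical stability
    probs = np.exp(sims)
    probs /= probs.sum(axis=1, keepdims=True)
    same_class = labels_all[None, :] == labels_all[idx][:, None]
    p_correct = (probs * same_class).sum(axis=1)
    return -np.log(p_correct + 1e-12).mean()

def prototype_ce_loss(feats, labels, prototypes, temperature=0.1):
    """Cross-entropy over similarities to class prototypes (assumed form)."""
    logits = feats @ l2_normalize(prototypes).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def joint_loss(feats, idx, labels_all, memory, prototypes, lam=1.0):
    """Sum of both terms; the weighting `lam` is a hypothetical parameter."""
    neighborhood = snca_loss(feats, idx, labels_all, memory)
    discrimination = prototype_ce_loss(feats, labels_all[idx], prototypes)
    return neighborhood + lam * discrimination

def momentum_update(memory, feats, idx, momentum=0.5):
    """Blend stored embeddings with the current batch, then re-normalize,
    so the memory bank changes smoothly across minibatches."""
    updated = momentum * memory[idx] + (1.0 - momentum) * feats
    memory[idx] = l2_normalize(updated)
    return memory
```

In a training loop under these assumptions, `joint_loss` would be evaluated on each minibatch and `momentum_update` called afterwards with the same `feats` and `idx`, so that only the rows belonging to the current batch are refreshed.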

Measuring similarity between embedding spaces using induced neighborhood graphs

Bridging the Semantic Latent Space Between Brain and Machine: Similarity is All You Need

Deep Metric Learning using Similarities from Nonlinear Rank Approximations

Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation

Deep Metric Learning Based on Scalable Neighborhood Components for Remote Sensing Scene Characterization

Video Annotation by Graph-Based Learning with Neighborhood Similarity

Decoupling Semantic Similarity from Spatial Alignment for Neural Networks

A New Similarity Space Tailored for Supervised Deep Metric Learning

Evaluating Representational Similarity Measures from the Lens of Functional Correspondence

An Angle-Based Method for Measuring the Semantic Similarity Between Visual and Textual Features

Hierarchical Large-scale Graph Similarity Computation via Graph Coarsening and Matching

Similarity of Neural Network Representations Revisited

Towards Measuring Place Function Similarity at Fine Spatial Granularity with Trajectory Embedding

Learning Similarity Measures in Non-Orthogonal Space

A new similarity measure for vector space models in text classification and information retrieval

Sim2vec: Node Similarity Preserving Network Embedding

Do Similar Entities have Similar Embeddings?

Hierarchical and Fast Graph Similarity Computation Via Graph Coarsening and Deep Graph Learning

Obtaining Cross Modal Similarity Metric with Deep Neural Architecture

Differentiable Optimization of Similarity Scores Between Models and Brains

Evaluation of taxonomic and neural embedding methods for calculating semantic similarity