Abstract: The objective of deep metric learning (DML) is to learn embeddings that capture semantic similarity and dissimilarity information among data points. Existing pairwise or tripletwise loss functions used in DML are known to suffer from slow convergence, because a large proportion of pairs or triplets become trivial as the model improves. To address this, ranking-motivated structured losses have recently been proposed to incorporate multiple examples and exploit the structured information among them. They converge faster and achieve state-of-the-art performance. In this work, we unveil two limitations of existing ranking-motivated structured losses and propose a novel ranked list loss to solve both of them. First, given a query, only a fraction of data points is incorporated to build the similarity structure. Consequently, some useful examples are ignored and the structure is less informative. To address this, we propose to build a set-based similarity structure by exploiting all instances in the gallery. The learning setting can be interpreted as few-shot retrieval: given a mini-batch, every example is iteratively used as a query, and the remaining examples compose the gallery to search, i.e., the support set in the few-shot setting. The remaining examples are split into a positive set and a negative set. For every mini-batch, the learning objective of ranked list loss is to make the query closer to the positive set than to the negative set by a margin. Second, previous methods aim to pull positive pairs as close as possible in the embedding space. As a result, the intraclass data distribution tends to be extremely compressed. In contrast, we propose to learn a hypersphere for each class in order to preserve useful similarity structure inside it, which functions as regularisation. Extensive experiments on the fine-grained image retrieval task demonstrate the superiority of our proposal over state-of-the-art methods.
Our source code is available online: https://github.com/XinshaoAmosWang/Ranked-List-Loss-for-DML.
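The set-based objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes Euclidean distance and two hypothetical hyperparameters, `alpha` and `margin`, that do not appear in the abstract: negatives are pushed beyond distance `alpha`, while positives are pulled only to within `alpha - margin`, so each class keeps a hypersphere rather than collapsing to a point. The function name `ranked_list_loss_sketch` and the plain averaging over violating examples are likewise illustrative choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ranked_list_loss_sketch(embeddings, labels, alpha=1.2, margin=0.4):
    """Simplified set-based loss over one mini-batch.

    alpha and margin are hypothetical hyperparameters (assumptions):
    negatives closer than alpha are penalised, positives farther than
    alpha - margin are penalised, and positives already inside that
    radius are left alone.
    """
    n = len(embeddings)
    total = 0.0
    for q in range(n):  # every example serves as the query once
        pos_terms, neg_terms = [], []
        for g in range(n):  # the remaining examples form the gallery
            if g == q:
                continue
            d = euclidean(embeddings[q], embeddings[g])
            if labels[g] == labels[q]:
                pos_terms.append(max(0.0, d - (alpha - margin)))
            else:
                neg_terms.append(max(0.0, alpha - d))
        # Average only over violating examples; satisfied ones add nothing.
        for terms in (pos_terms, neg_terms):
            active = [t for t in terms if t > 0.0]
            if active:
                total += sum(active) / len(active)
    return total / n


# Toy mini-batch: two classes, two 2-D embeddings each.
batch = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]]
batch_labels = [0, 0, 1, 1]
loss = ranked_list_loss_sketch(batch, batch_labels)
```

In this toy batch every positive already lies inside the assumed radius and every negative lies beyond `alpha`, so no term is active and the loss is zero. Moving same-class points even closer together would not lower it further, which is the sense in which the hypersphere preserves intraclass structure.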
