Abstract: Most existing image-text matching methods adopt triplet loss as the optimization objective, and choosing a proper negative sample for the triplet <anchor, positive, negative> is important for training the model effectively; for example, hard negatives make the model learn efficiently. However, we observe that existing methods mainly take the most similar samples as hard negatives, and these may not be true negatives. In other words, samples that have high similarity with the anchor but are not paired with it may retain positive semantic associations; we call them false negatives. Repelling these false negatives in the triplet loss misleads semantic representation learning and results in inferior retrieval performance. In this paper, we propose a novel False Negative Elimination (FNE) strategy that selects negatives via sampling, which can alleviate the problem introduced by false negatives. Specifically, we first construct the distributions of positive and negative samples separately from their similarities with the anchor, based on the features extracted by the image and text encoders. We then calculate the false negative probability of a given sample from its similarity with the anchor and the above distributions via Bayes' rule, and this probability is employed as the sampling weight during the negative sampling process. Since a small batch may not contain any false negatives, we design a memory module with momentum to retain a large negative buffer and apply our negative sampling strategy over the buffer. In addition, to make the model focus on hard negatives, we reassign the sampling weights of the simple negatives with a cut-down strategy. Extensive experiments are conducted on Flickr30K and MS-COCO, and the results demonstrate the superiority of the proposed false negative elimination strategy.
The code is available at https://github.com/LuminosityX/FNE.
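
The Bayes' rule step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian form of the two similarity distributions, the prior `prior_pos`, the choice of `1 - probability` as the sampling weight, and the `easy_threshold` and `cut_down` parameters are all assumptions made here for illustration, and every function name is hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def false_negative_probability(sim, pos_sims, neg_sims, prior_pos=0.01):
    """Posterior P(semantically positive | similarity) via Bayes' rule.

    Assumptions (not from the source): both similarity distributions are
    modelled as Gaussians, and prior_pos is a hand-set prior.
    """
    mu_p, sd_p = mean(pos_sims), max(pstdev(pos_sims), 1e-6)
    mu_n, sd_n = mean(neg_sims), max(pstdev(neg_sims), 1e-6)
    joint_pos = prior_pos * gaussian_pdf(sim, mu_p, sd_p)
    joint_neg = (1.0 - prior_pos) * gaussian_pdf(sim, mu_n, sd_n)
    total = joint_pos + joint_neg
    return joint_pos / total if total > 0.0 else 0.0


def sampling_weights(cand_sims, pos_sims, neg_sims,
                     easy_threshold=0.3, cut_down=0.1):
    """Weight each candidate negative for sampling.

    Assumption: a likely false negative gets a low weight (1 - posterior),
    and easy negatives (similarity below easy_threshold) are scaled down
    by cut_down so that sampling favours hard negatives.
    """
    weights = []
    for s in cand_sims:
        w = 1.0 - false_negative_probability(s, pos_sims, neg_sims)
        if s < easy_threshold:
            w *= cut_down
        weights.append(w)
    return weights


def sample_negative(weights, rng=random):
    """Draw one candidate index in proportion to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Toy anchor similarities: paired samples vs. unpaired samples.
pos_sims = [0.80, 0.85, 0.90, 0.75]
neg_sims = [0.10, 0.20, 0.30, 0.25, 0.15]
# Candidates: a likely false negative, a hard negative, an easy negative.
cand_sims = [0.85, 0.50, 0.20]
weights = sampling_weights(cand_sims, pos_sims, neg_sims)
chosen = sample_negative(weights)
```

In this toy setting the candidate at similarity 0.50 receives the largest weight, the easy one at 0.20 is cut down, and the candidate at 0.85 is almost never drawn because it looks like a positive. In the paper the candidates come from a momentum-updated memory buffer rather than a single batch; that part is omitted here.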

Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination
