Abstract: Efficient and accurate product relevance assessment is critical for user experiences and business success. Training a proficient relevance assessment model requires high-quality query-product pairs, often obtained through negative sampling strategies. Unfortunately, current methods introduce pooling bias by mistakenly sampling false negatives, diminishing performance and business impact. To address this, we present Bias-mitigating Hard Negative Sampling (BHNS), a novel negative sampling strategy tailored to identify and adjust for false negatives, building upon our original False Negative Estimation algorithm. Our experiments in the Instacart search setting confirm BHNS as effective for practical e-commerce use. Furthermore, comparative analyses on a public dataset showcase its domain-agnostic potential for diverse applications.

What problem does this paper attempt to address?

This paper addresses the pooling bias that negative sampling strategies introduce in e-commerce search. Specifically, current methods mistakenly sample some query-product pairs that are actually relevant (i.e., false negatives) as irrelevant, which reduces model performance and business impact. The paper proposes a new negative sampling strategy, Bias-mitigating Hard Negative Sampling (BHNS), which identifies and adjusts for these false negatives to improve relevance assessment models in e-commerce search.

### Main Contributions:

1. **Proposing the BHNS Strategy**: Building on the False Negative Estimation (FNE) algorithm, BHNS adjusts the negative sampling process by estimating the probability that a query-product pair is a false negative.
2. **Experimental Validation**: Experiments in Instacart's real search setting and on public datasets verify the effectiveness and generalization ability of BHNS.
3. **Dual Insurance Mechanism**: Two complementary methods, sampling regularization and pseudo-label generation, further reduce the impact of pooling bias.

### Method Overview:

- **False Negative Estimation (FNE)**: Uses semantic similarity to estimate the probability that a query-product pair is a false negative. Specifically, if two queries are semantically similar, the same product is likely to have similar relevance to both.
- **Sampling Regularization**: When selecting hard negative samples, FNE is introduced as a regularization term to reduce the probability that false negatives are mistakenly selected.
- **Pseudo-label Generation**: Generates pseudo-labels for potential false negatives to further reduce pooling bias.

### Experimental Results:

- **Performance on Public Datasets**: Experimental results on the STS benchmark dataset show that BHNS outperforms other baseline methods on multiple evaluation metrics, especially in handling false negatives.
- **Offline Experiments**: On Instacart's real dataset, BHNS also demonstrated superior performance, effectively mitigating pooling bias and improving the accuracy of search relevance assessment.

### Conclusion:

The BHNS strategy proposed in the paper effectively addresses the pooling bias introduced by negative sampling in e-commerce search. Through false negative estimation and the dual insurance mechanism, it significantly improves model performance and business outcomes.
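To make the three steps in the Method Overview concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's actual formulas: the FNE score is approximated as the highest cosine similarity between the target query and any other query for which the product is a known positive, the regularized weight is `hardness * (1 - fne) ** lam`, and the pseudo-label rule uses a fixed threshold. All names (`false_negative_estimate`, `sampling_weights`, `pseudo_label`, `lam`, `threshold`) and the toy data are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def false_negative_estimate(query, product, query_vecs, positives):
    """Assumed FNE proxy: the highest similarity between `query` and any
    other query for which `product` is a known positive, clipped to [0, 1]."""
    sims = [
        cosine(query_vecs[query], query_vecs[q])
        for q, p in positives
        if p == product and q != query
    ]
    return min(1.0, max(0.0, max(sims, default=0.0)))


def sampling_weights(query, candidates, hardness, query_vecs, positives, lam=1.0):
    """Assumed sampling regularization: down-weight hard candidates that are
    likely false negatives. `lam` controls the strength of the penalty."""
    weights = {}
    for product in candidates:
        fne = false_negative_estimate(query, product, query_vecs, positives)
        weights[product] = hardness[product] * (1.0 - fne) ** lam
    return weights


def pseudo_label(fne, threshold=0.9):
    """Assumed pseudo-label rule: likely false negatives get a soft positive
    label equal to their FNE score; everything else stays a negative (0.0)."""
    return fne if fne >= threshold else 0.0


# Toy data: 2-d query embeddings and known positive (query, product) pairs.
query_vecs = {
    "organic milk": [1.0, 0.1],
    "whole milk": [0.9, 0.2],
    "dish soap": [0.0, 1.0],
}
positives = [("whole milk", "milk_1gal"), ("dish soap", "soap_500ml")]

# Candidate negatives for "organic milk" with model-assigned hardness scores.
candidates = ["milk_1gal", "soap_500ml"]
hardness = {"milk_1gal": 0.9, "soap_500ml": 0.3}

weights = sampling_weights("organic milk", candidates, hardness, query_vecs, positives)

# Draw one negative in proportion to the regularized weights.
rng = random.Random(0)
sampled = rng.choices(candidates, weights=[weights[c] for c in candidates], k=1)
```

In this toy setup `milk_1gal` is the hardest candidate for "organic milk", but because it is a known positive for the very similar query "whole milk", its FNE score is high and its sampling weight falls below that of `soap_500ml`. The same FNE score could then feed `pseudo_label` instead of treating the pair as a plain negative.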

Mitigating Pooling Bias in E-commerce Search via False Negative Estimation

Hard Negatives or False Negatives: Correcting Pooling Bias in Training Neural Ranking Models

Crowdsourcing Detection of Sampling Biases in Image Datasets

Matryoshka Peek: Toward Learning Fine-Grained, Robust, Discriminative Features for Product Search

Enhancing Recommender Systems: A Strategy to Mitigate False Negative Impact

Adaptive Hardness Negative Sampling for Collaborative Filtering

SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval

Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination

Learning Robust Models for e-Commerce Product Search

Enhanced Bayesian Personalized Ranking for Robust Hard Negative Sampling in Recommender Systems

Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering

Commonsense Knowledge Salience Evaluation with a Benchmark Dataset in E-commerce

Augmented Negative Sampling for Collaborative Filtering

Relevance Filtering for Embedding-based Retrieval

Towards Automated Negative Sampling in Implicit Recommendation

Learning a Product Relevance Model from Click-Through Data in E-Commerce

Evaluating Performance and Bias of Negative Sampling in Large-Scale Sequential Recommendation Models

Bayesian Negative Sampling for Recommendation

ABNS: Association-based negative sampling for collaborative filtering

Addressing Marketing Bias in Product Recommendations

Self-Sampling Training and Evaluation for the Accuracy-Bias Tradeoff in Recommendation