Towards Real World Debiasing: A Fine-grained Analysis On Spurious Correlation

Zhibo Wang, Peng Kuang, Zhixuan Chu, Jingyi Wang, Kui Ren

2024-05-30

Abstract: Spurious correlations in training data significantly hinder the generalization capability of machine learning models when faced with distribution shifts in real-world scenarios. To tackle the problem, numerous debias approaches have been proposed and benchmarked on datasets intentionally designed with severe biases. However, it remains to be asked: (1) Do existing benchmarks really capture biases in the real world? (2) Can existing debias methods handle biases in the real world? To answer the questions, we revisit biased distributions in existing benchmarks and real-world datasets, and propose a fine-grained framework for analyzing dataset bias by disentangling it into the magnitude and prevalence of bias. We observe and theoretically demonstrate that existing benchmarks poorly represent real-world biases. We further introduce two novel biased distributions to bridge this gap, forming a nuanced evaluation framework for real-world debiasing. Building upon these results, we evaluate existing debias methods with our evaluation framework. Results show that existing methods are incapable of handling real-world biases. Through in-depth analysis, we propose a simple yet effective approach that can be easily applied to existing debias methods, named Debias in Destruction (DiD). Empirical results demonstrate the superiority of DiD, improving the performance of existing methods on all types of biases within the proposed evaluation framework.

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses the poor generalization of machine learning models under real-world distribution shifts, which is caused by spurious correlations in the training data. Specifically, it asks whether existing debiasing benchmarks truly capture real-world biases and whether current debiasing methods can handle them. To answer these questions, the authors propose a fine-grained framework that analyzes dataset bias by disentangling it into two components: magnitude and prevalence. Through observation and theoretical analysis, they find that existing benchmarks poorly represent real-world biases. They then introduce two new biased distributions to bridge this gap, forming a more nuanced evaluation framework for real-world debiasing. Using this framework, they evaluate existing debiasing methods and find that these methods fail to handle real-world biases. Finally, they propose a simple yet effective approach called "Debias in Destruction" (DiD), which can be easily applied to existing debiasing methods and improves their performance on all types of biases within the proposed evaluation framework.

Mitigating Bias in Dataset Distillation

Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias

Debiasing Intrinsic Bias and Application Bias Jointly Via Invariant Risk Minimization (Student Abstract)

Debias Can be Unreliable: Mitigating Bias Issue in Evaluating Debiasing Recommendation

DeNetDM: Debiasing by Network Depth Modulation

Revisiting the Dataset Bias Problem from a Statistical Perspective

Fair Visual Recognition via Intervention with Proxy Features

HateDebias: On the Diversity and Variability of Hate Speech Debiasing

Exploring the Impact of Dataset Bias on Dataset Distillation

Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness

Debiasing Stance Detection Models with Counterfactual Reasoning and Adversarial Bias Learning

AutoDebias: Learning to Debias for Recommendation

Adaptive Data Debiasing through Bounded Exploration

Bias and Debias in Recommender System: A Survey and Future Directions

Fine-tuning a Biased Model for Improving Fairness

IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models

De-biasing "bias" measurement

Debiasing NLU Models via Causal Intervention and Counterfactual Reasoning

Bounding System-Induced Biases in Recommender Systems with A Randomized Dataset

A Multi-modal Debiasing Model with Dynamical Constraint for Robust Visual Question Answering