Abstract: Computer vision technologies have been applied to an increasingly wide range of applications, from autonomous car navigation, to medical image analysis, to precision agriculture. Despite these exciting innovations, recent studies reveal a number of risks in using existing computer vision systems, suggesting that the results of such systems may be unfair or untrustworthy. For example, major commercial facial analysis tools were shown to have substantial accuracy disparities for people of different genders or skin colors (Buolamwini and Gebru 2018). Visual semantic role labeling models were found to exhibit societal biases and stereotypes (Zhao et al. 2017), such as frequently associating certain activity labels with a specific gender (e.g., associating “cooking” with women). Even worse, seemingly accurate image classifiers may in fact make their predictions by picking up spurious correlations between objects and irrelevant background information rather than by identifying meaningful features of the objects (Ribeiro, Singh, and Guestrin 2016). Many of the risks embedded in modern computer vision systems can be attributed to the use of a biased training dataset. Indeed, the computer vision community has long recognized that many visual datasets carry varying degrees of built-in bias due to factors such as the photographic style of photographers and the selection choices of dataset curators (Torralba and Efros 2011). Using these biased datasets to train machine learning models for different computer vision tasks naturally leads to the phenomenon of “bias in, bias out” and results in undesirable performance. Thus, to mitigate the fairness, accountability, and transparency concerns in computer vision, a crucial step is to start the entire pipeline with high-quality visual datasets that, at the very least, are authentic representations of the visual world.
In other words, being able to detect potential biases hidden in the datasets prior to model development is a key step in guarding against unfair or untrustworthy outcomes in computer vision. While a few techniques have been developed to automatically detect dataset biases (Tramer et al. 2017), the unstructured nature of visual data makes bias discovery in image datasets particularly challenging. This is because no human-comprehensible attributes can be directly leveraged
