Abstract: In recent years, the rapid advancement of machine learning (ML) models, particularly transformer-based pre-trained models, has revolutionized Natural Language Processing (NLP) and Computer Vision (CV) fields. However, researchers have discovered that these models can inadvertently capture and reinforce social biases present in their training datasets, leading to potential social harms, such as uneven resource allocation and unfair representation of specific social groups. Addressing these biases and ensuring fairness in artificial intelligence (AI) systems has become a critical concern in the ML community. The recent introduction of pre-trained vision-and-language (VL) models in the emerging multimodal field demands attention to the potential social biases present in these models as well. Although VL models are susceptible to social bias, there is a limited understanding compared to the extensive discussions on bias in NLP and CV. This survey aims to provide researchers with a high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL. By examining these perspectives, the survey aims to offer valuable guidelines on how to approach and mitigate social bias in both unimodal and multimodal settings. The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models in various applications and research endeavors.

What problem does this paper attempt to address?

The paper addresses social bias in machine learning (ML) models. As Transformer-based pre-trained models have spread across natural language processing (NLP) and computer vision (CV), researchers have found that these models may inadvertently capture and reinforce social biases present in their training datasets, leading to potential social harms such as unequal resource allocation and unfair representation of social groups. To tackle this issue, the paper aims to:

1. **Provide a comprehensive cross-domain perspective**: By comparing social bias research across NLP, CV, and vision-and-language (VL) models, the paper gives researchers a high-level view of the similarities and differences among the three areas. Social bias research in NLP and CV is relatively advanced, whereas discussion of VL models remains comparatively scarce.
2. **Propose guiding recommendations**: Building on this comparison, the paper suggests methods for identifying, assessing, and mitigating social bias in both unimodal (text or image) and multimodal (combined vision and language) settings. These recommendations are intended to support the development of fairer, less biased artificial intelligence systems.

Specifically, the paper focuses on gender bias and racial bias, categorizing and summarizing the different types of bias metrics and mitigation methods so that readers can better understand existing research findings and their limitations. It also emphasizes the importance of evaluating and mitigating bias in multimodal models and provides an outlook on future research directions.

In summary, this review paper aims to push the machine learning community towards building fairer and more inclusive AI systems.

Survey of Social Bias in Vision-Language Models
