Survey of Social Bias in Vision-Language Models

Nayeon Lee, Yejin Bang, Holy Lovenia, Samuel Cahyawijaya, Wenliang Dai, Pascale Fung
2023-09-24
Abstract: In recent years, the rapid advancement of machine learning (ML) models, particularly transformer-based pre-trained models, has revolutionized Natural Language Processing (NLP) and Computer Vision (CV) fields. However, researchers have discovered that these models can inadvertently capture and reinforce social biases present in their training datasets, leading to potential social harms, such as uneven resource allocation and unfair representation of specific social groups. Addressing these biases and ensuring fairness in artificial intelligence (AI) systems has become a critical concern in the ML community. The recent introduction of pre-trained vision-and-language (VL) models in the emerging multimodal field demands attention to the potential social biases present in these models as well. Although VL models are susceptible to social bias, there is a limited understanding compared to the extensive discussions on bias in NLP and CV. This survey aims to provide researchers with a high-level insight into the similarities and differences of social bias studies in pre-trained models across NLP, CV, and VL. By examining these perspectives, the survey aims to offer valuable guidelines on how to approach and mitigate social bias in both unimodal and multimodal settings. The findings and recommendations presented here can benefit the ML community, fostering the development of fairer and non-biased AI models in various applications and research endeavors.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses social bias in machine learning (ML) models. With the widespread use of Transformer-based pre-trained models in natural language processing (NLP) and computer vision (CV), researchers have found that these models may inadvertently capture and reinforce social biases present in their training datasets, leading to potential social harms such as unequal resource allocation and unfair representation of social groups. To tackle this issue, the paper aims to:

1. **Provide a comprehensive cross-domain perspective**: By comparing social bias research across NLP, CV, and vision-and-language (VL) models, the paper gives researchers a high-level view of the similarities and differences among the three. Social bias research in NLP and CV is relatively advanced, whereas discussion of VL models remains comparatively scarce.
2. **Propose guiding recommendations**: Building on this comparison, the paper suggests methods for identifying, assessing, and mitigating social bias in both unimodal (text or image) and multimodal (combined vision and language) settings. These recommendations are intended to support the development of fairer, less biased artificial intelligence systems.

Specifically, the paper examines gender bias and racial bias, categorizing and summarizing the different types of bias metrics and mitigation methods so that readers can better understand existing research findings and their limitations. The paper also emphasizes the importance of evaluating and mitigating bias in multimodal models and outlines directions for future research.

In summary, this survey aims to push the machine learning community toward building fairer and more inclusive AI systems.