Abstract: With the advent of Large Language Models (LLMs) possessing increasingly impressive capabilities, a number of Large Vision-Language Models (LVLMs) have been proposed to augment LLMs with visual inputs. Such models condition generated text on both an input image and a text prompt, enabling a variety of use cases such as visual question answering and multimodal chat. While prior studies have examined the social biases contained in text generated by LLMs, this topic has been relatively unexplored in LVLMs. Examining social biases in LVLMs is particularly challenging due to the confounding contributions of bias induced by information contained across the text and visual modalities. To address this challenging problem, we conduct a large-scale study of text generated by different LVLMs under counterfactual changes to input images. Specifically, we present LVLMs with identical open-ended text prompts while conditioning on images from different counterfactual sets, where each set contains images that are largely identical in their depiction of a common subject (e.g., a doctor) and vary only in intersectional social attributes (e.g., race and gender). We comprehensively evaluate the text produced by different models under this counterfactual generation setting at scale, producing over 57 million responses from popular LVLMs. Our multi-dimensional analysis reveals that social attributes such as race, gender, and physical characteristics depicted in input images can significantly influence the generation of toxic content, competency-associated words, harmful stereotypes, and numerical ratings of depicted individuals. We additionally explore the relationship between social bias in LVLMs and their corresponding LLMs, as well as inference-time strategies to mitigate bias.
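
As an illustration of the counterfactual setup described in the abstract, the following is a minimal sketch, not the authors' pipeline. It assumes each generated response has already been reduced to a single numeric score (for example, a toxicity score from some classifier) and tagged with its counterfactual set and the social attribute depicted in the image. The function name `max_score_gap`, the record layout, and the choice of the max-minus-min gap as a summary statistic are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def max_score_gap(records):
    """Summarize how much a score varies across attributes within each set.

    records: iterable of (set_id, attribute, score) tuples (hypothetical layout).
    Returns {set_id: highest per-attribute mean minus lowest per-attribute mean}.
    """
    # Group scores first by counterfactual set, then by depicted attribute.
    by_set = defaultdict(lambda: defaultdict(list))
    for set_id, attribute, score in records:
        by_set[set_id][attribute].append(score)

    gaps = {}
    for set_id, by_attr in by_set.items():
        # Because the prompt and subject are held fixed within a set, a large
        # gap between attribute means suggests the attribute influenced the text.
        means = [mean(scores) for scores in by_attr.values()]
        gaps[set_id] = max(means) - min(means)
    return gaps


# Toy data: two counterfactual sets, two attribute groups each.
example = [
    ("doctor", "group_a", 0.1),
    ("doctor", "group_a", 0.3),
    ("doctor", "group_b", 0.5),
    ("nurse", "group_a", 0.2),
    ("nurse", "group_b", 0.2),
]
gaps = max_score_gap(example)
```

A gap near zero for a set means the score did not change with the depicted attribute, while a larger gap flags that set for closer inspection.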

Selection Bias Induced Spurious Correlations in Large Language Models

UnMASKed: Quantifying Gender Biases in Masked Language Models through Linguistically Informed Job Market Prompts

Gender Bias in Large Language Models across Multiple Languages

Gender bias and stereotypes in Large Language Models

The Unequal Opportunities of Large Language Models: Revealing Demographic Bias through Job Recommendations

Underspecification in Language Modeling Tasks: A Causality-Informed Study of Gendered Pronoun Resolution

Interpreting Bias in Large Language Models: A Feature-Based Approach

Gender Bias in Decision-Making with Large Language Models: A Study of Relationship Conflicts

Uncovering Bias in Large Vision-Language Models with Counterfactuals

Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals

Locating and Mitigating Gender Bias in Large Language Models

Understanding Intrinsic Socioeconomic Biases in Large Language Models

Fewer Errors, but More Stereotypes? The Effect of Model Size on Gender Bias

Aligning with Whom? Large Language Models Have Gender and Racial Biases in Subjective NLP Tasks

Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias

Generative Language Models Exhibit Social Identity Biases

Evaluation of Large Language Models: STEM education and Gender Stereotypes

Learning from Red Teaming: Gender Bias Provocation and Mitigation in Large Language Models

Unveiling and Mitigating Bias in Mental Health Analysis with Large Language Models

Towards detecting unanticipated bias in Large Language Models