Abstract: Text-to-Image (TTI) generative models have shown great progress in the past few years in terms of their ability to generate complex and high-quality imagery. At the same time, these models have been shown to suffer from harmful biases, including exaggerated societal biases (e.g., gender, ethnicity), as well as incidental correlations that limit such a model's ability to generate more diverse imagery. In this paper, we propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt, using counterfactual reasoning. Unlike other works that evaluate generated images on a predefined set of bias axes, our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases. In addition, we complement quantitative scores with post-hoc explanations in terms of semantic concepts in the images generated. We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts, as well as the intersectionality between different biases for any given prompt. We perform extensive user studies to illustrate that the results of our method and analysis are consistent with human judgements.

What problem does this paper attempt to address?

The paper addresses bias in text-to-image (TTI) generative models. These biases include societal biases (such as gender and race) and incidental correlations (for example, programmers are usually depicted as men wearing glasses), both of which limit a model's ability to generate diverse images. Existing evaluation methods usually rely on a predefined set of bias axes, so they can underestimate or miss biases that fall outside that set. The paper therefore proposes TIBET (Text to Image Bias Evaluation Tool), a framework that dynamically identifies and quantifies a wide range of biases for any TTI model and any input prompt, and provides concept-based explanations of the biases in the generated images.

### Main Contributions

1. **Dynamic Bias Axis Identification and Measurement**: Unlike existing methods, which mainly focus on predefined bias axes, TIBET generates the relevant bias axes from the input prompt, so the evaluation adapts to each prompt.
2. **New Quantitative Indicator CAS**: The Concept Association Score (CAS) is proposed to quantify biases and to support post-hoc explanations along different bias dimensions.
3. **Intersectional Analysis of Multidimensional Biases**: TIBET detects and quantifies bias on a single axis and also examines how different bias axes intersect, giving a more comprehensive bias analysis.
4. **Combination with Bias Mitigation Techniques**: Experiments show that TIBET can be combined with bias mitigation techniques (such as ITI-GEN) to further improve the fairness of TTI models.

### Method Overview

1. **Dynamic Bias Axis Generation**: Use a large language model (such as GPT-3) to generate relevant bias axes for the input prompt.
2. **Counterfactual Prompt Generation**: Generate counterfactual prompts for each bias axis for comparative analysis.
3. **Image Generation**: Use a black-box TTI model to generate image sets for the initial prompt and for each counterfactual prompt.
4. **Image Comparison**:
   - **Concept Extraction Based on VQA**: Use a visual question answering (VQA) model to extract concepts from the images and calculate the Concept Association Score (CAS).
   - **Vision-Language Embedding Model**: Use the CLIP model to embed the images directly and calculate cosine similarity, giving a CLIP-based variant of the score (CAS_CLIP).
5. **Bias Evaluation Indicators**:
   - **Mean Absolute Deviation (MAD)**: Quantifies the degree of bias along a bias axis.
   - **Qualitative Indicators**: Provide concept-based explanations, such as Top-K concepts and axis-aligned Top-K concepts.

### Experimental Results

1. **Qualitative Results**: Specific examples show how TIBET analyzes biases in generated images, including cultural, gender, and facial expression biases.
2. **Gender Stereotypes in Occupations**: Gender bias is evaluated for 11 occupation prompts. The results show that TIBET can detect and quantify gender bias, and that for some occupations the bias is significantly reduced after applying ITI-GEN.
3. **Human Evaluation**: User studies verify that TIBET's results are consistent with human judgments.

In conclusion, TIBET provides a comprehensive and flexible method to identify, quantify, and explain biases in TTI models, which helps improve the fairness and diversity of these models.
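
The comparison and scoring steps in the Method Overview can be illustrated with a small Python sketch. The summary does not give the formula for CAS, so `association_score` below is only an assumed stand-in (a histogram overlap between concept frequencies), and the toy concept lists are invented for illustration rather than taken from the paper. `cosine_similarity` shows the kind of comparison an embedding-based variant would make on image vectors, which in practice would come from a CLIP image encoder that is not shown here. `mean_absolute_deviation` is the standard statistic, applied here, as an assumption, to the per-counterfactual scores of one bias axis.

```python
from collections import Counter
from math import sqrt
from statistics import mean


def concept_distribution(concepts_per_image):
    """Turn per-image concept lists into relative concept frequencies."""
    counts = Counter(c for concepts in concepts_per_image for c in concepts)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def association_score(initial, counterfactual):
    """Assumed stand-in for CAS: overlap of two concept histograms, in [0, 1]."""
    p = concept_distribution(initial)
    q = concept_distribution(counterfactual)
    return sum(min(p.get(c, 0.0), q.get(c, 0.0)) for c in set(p) | set(q))


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def mean_absolute_deviation(scores):
    """Average absolute distance of each score from the mean score."""
    mu = mean(scores)
    return mean(abs(s - mu) for s in scores)


# Hypothetical concepts a VQA model might return for two images per prompt.
initial = [["man", "glasses"], ["man", "laptop"]]
counterfactuals = {
    "female programmer": [["woman", "glasses"], ["woman", "laptop"]],
    "male programmer": [["man", "glasses"], ["man", "laptop"]],
}

# One score per counterfactual on the (assumed) gender axis, then one MAD for the axis.
scores = {name: association_score(initial, images)
          for name, images in counterfactuals.items()}
axis_mad = mean_absolute_deviation(list(scores.values()))

print(scores)
print(axis_mad)
```

With this toy data, the initial images overlap fully with the "male programmer" counterfactual (score 1.0) and only partly with the "female programmer" one (score 0.5), so the axis MAD is 0.25. Under the assumptions of this sketch, a MAD of 0 would mean the initial prompt sits equally close to every counterfactual on the axis, while a larger value means it leans toward some of them.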

TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
