Diminishing Stereotype Bias in Image Generation Model using Reinforcement Learning Feedback

Xin Chen, Virgile Foussereau

2024-06-28

Abstract: This study addresses gender bias in image generation models using Reinforcement Learning from Artificial Intelligence Feedback (RLAIF) with a novel Denoising Diffusion Policy Optimization (DDPO) pipeline. By employing a pretrained Stable Diffusion model and a highly accurate gender classification Transformer, the research introduces two reward functions: Rshift for shifting gender imbalances, and Rbalance for achieving and maintaining gender balance. Experiments demonstrate the effectiveness of this approach in mitigating bias without compromising image quality or requiring additional data or prompt modifications. While focusing on gender bias, this work establishes a foundation for addressing various forms of bias in AI systems, emphasizing the need for responsible AI development. Future research directions include extending the methodology to other bias types, enhancing the RLAIF pipeline's robustness, and exploring multi-prompt fine-tuning to further advance fairness and inclusivity in AI.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses stereotype bias in image generation models, with a focus on gender bias. As image generation technology has advanced, synthetic images have become difficult to distinguish from real ones. This progress brings ethical challenges, because models may amplify social stereotypes related to attributes such as gender and race. The paper proposes a method based on Reinforcement Learning from Artificial Intelligence Feedback (RLAIF) that reduces gender bias by fine-tuning a pretrained diffusion model, without additional data or hard-coded changes to the prompts. It introduces two reward functions, Rshift and Rbalance. Rshift shifts a gender imbalance quickly, within a few fine-tuning steps, while Rbalance is used to reach and then maintain gender balance in the generated images. Experimental results show that the method reduces gender bias without sacrificing image quality and performs well on multiple occupation-related prompts. The paper also explores how improved trust region constraints can make the fine-tuning process more stable. Overall, it offers a new approach to reducing gender bias in image generation models and emphasizes the importance of responsible AI development.
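To make the distinction between the two rewards concrete, here is a minimal sketch in Python. It is not the paper's formulation: this page does not give the actual definitions of Rshift and Rbalance, so the formulas, the function names r_shift and r_balance, and the target_ratio parameter are all assumptions chosen for illustration. The sketch assumes a gender classifier that returns, for each generated image, the probability that it belongs to the under-represented class.

```python
from typing import List


def r_shift(p_target: List[float]) -> List[float]:
    """Hypothetical shift-style reward (assumed form, not the paper's).

    Each image is scored by the classifier probability of the
    under-represented class, rescaled from [0, 1] to [-1, 1], so the
    reward always pushes generation toward that class.
    """
    return [2.0 * p - 1.0 for p in p_target]


def r_balance(p_target: List[float], target_ratio: float = 0.5) -> List[float]:
    """Hypothetical balance-style reward (assumed form, not the paper's).

    The per-image score from r_shift is scaled by the gap between the
    desired ratio and the ratio observed in the batch. The reward fades
    to zero once the batch is balanced and changes sign if the batch
    overshoots the target.
    """
    if not p_target:
        raise ValueError("p_target must contain at least one probability")
    observed_ratio = sum(p_target) / len(p_target)
    gap = target_ratio - observed_ratio
    return [gap * (2.0 * p - 1.0) for p in p_target]


if __name__ == "__main__":
    # One of four images is classified as the under-represented class.
    batch = [0.0, 0.0, 0.0, 1.0]
    print(r_shift(batch))
    print(r_balance(batch))
```

Under these assumptions, the shift-style reward keeps pushing in one direction regardless of the current ratio, which matches its described role of correcting an imbalance in a few steps. The balance-style reward depends on batch statistics, which is one plausible way to hold the ratio near the target once it has been reached.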
