Visual Stereotypes of Autism Spectrum in DALL-E, Stable Diffusion, SDXL, and Midjourney

Maciej Wodziński, Marcin Rządeczka, Anastazja Szuła, Marta Sokół, Marcin Moskalewicz
2024-07-24
Abstract: Avoiding systemic discrimination requires investigating AI models' potential to propagate stereotypes resulting from the inherent biases of training datasets. Our study investigated how text-to-image models unintentionally perpetuate non-rational beliefs regarding autism. The research protocol involved generating images based on 53 prompts aimed at visualizing concrete objects and abstract concepts related to autism across four models: DALL-E, Stable Diffusion, SDXL, and Midjourney (N=249). Expert assessment of results was performed via a framework of 10 deductive codes representing common stereotypes contested by the community regarding their presence and spatial intensity, quantified on ordinal scales and subject to statistical analysis of inter-rater reliability and effect sizes. The models frequently utilised controversial themes and symbols, which were unevenly distributed; however, there was striking homogeneity in terms of skin colour, gender, and age, with autistic individuals portrayed as engaged in solitary activities, interacting with objects rather than people, and displaying stereotypical emotional expressions such as paleness, anger, or sadness. Secondly, we observed representational insensitivity regarding autism images despite directional prompting aimed at falsifying the above results. Additionally, DALL-E explicitly denied perpetuating stereotypes. We interpret this as ANNs mirroring the human cognitive architecture regarding the discrepancy between background and reflective knowledge, as justified by our previous research on autism-related stereotypes in humans.
Computers and Society, Artificial Intelligence
What problem does this paper attempt to address?
The paper examines how text-to-image AI models inadvertently perpetuate social stereotypes about autism spectrum disorder (ASD) when generating autism-related images. Its main points are:

1. **Research Background**: As AI models become significant sources of knowledge and perspectives, it is crucial to analyze the cognitive biases within these models and their oversimplified representations of social phenomena. Avoiding systemic discrimination requires investigating the stereotypes that AI models may propagate due to inherent biases in the training data.
2. **Research Objectives**: The paper studies how four text-to-image models (DALL-E, Stable Diffusion, SDXL, and Midjourney) reflect prevalent social stereotypes when generating images from autism-related prompts.
3. **Methodology**: The researchers designed a protocol of 53 prompts to visualize concrete objects and abstract concepts related to autism. The prompts were input into the four models, each of which generated multiple images (N=249). Experts evaluated the generated images with a framework of 10 deductive codes representing common stereotypes contested by the autism community, rating the presence and spatial intensity of each on ordinal scales.
4. **Findings**:
   - The models frequently used controversial themes and symbols. These were unevenly distributed, yet the images were strikingly homogeneous in skin color, gender, and age.
   - Autistic individuals were often depicted as engaged in solitary activities, interacting with objects rather than people, and displaying stereotypical emotional expressions such as pallor, anger, or sadness.
   - Representational insensitivity persisted even when the researchers used directional prompts intended to falsify these results.
   - DALL-E explicitly denied perpetuating stereotypes.
5. **Conclusion**: The study shows how text-to-image models reflect social stereotypes related to autism and discusses the implications of these findings for understanding human cognitive structures. The paper also highlights the limitations of current "fairness protocols," which address the issue only temporarily without resolving the biases present in the training data.
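
The abstract states that expert ratings were recorded on ordinal scales and checked for inter-rater reliability, but it does not name the statistic used. As an illustration only, the sketch below computes a quadratically weighted Cohen's kappa, one common choice for two raters on an ordinal scale. The function name `weighted_kappa`, the 0 to 3 intensity scale, and the example ratings are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_levels):
    """Quadratically weighted Cohen's kappa for two raters.

    Ratings are integers in 0..n_levels-1 on an ordinal scale.
    This is an assumed statistic; the paper may have used another one.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_levels < 2:
        raise ValueError("an ordinal scale needs at least two levels")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (a, b) pair
    marg_a = Counter(rater_a)                  # marginal counts for rater A
    marg_b = Counter(rater_b)                  # marginal counts for rater B
    max_dist = (n_levels - 1) ** 2

    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            # Quadratic weight: larger gaps between levels count more.
            w = (i - j) ** 2 / max_dist
            obs_disagreement += w * observed[(i, j)] / n
            exp_disagreement += w * (marg_a[i] / n) * (marg_b[j] / n)

    if exp_disagreement == 0:
        # Both raters used one identical level throughout; kappa is undefined.
        return float("nan")
    return 1.0 - obs_disagreement / exp_disagreement


if __name__ == "__main__":
    # Hypothetical intensity scores (0 = absent ... 3 = dominant) that two
    # experts might assign to one stereotype code across eight images.
    expert_1 = [0, 1, 3, 2, 0, 3, 1, 2]
    expert_2 = [0, 2, 3, 2, 1, 3, 1, 1]
    print(f"weighted kappa: {weighted_kappa(expert_1, expert_2, n_levels=4):.3f}")
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. With more than two raters, a statistic such as Krippendorff's alpha would be a more natural fit.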