Abstract: Large language models (LLMs) reflect societal norms and biases, especially about gender. While societal biases and stereotypes have been extensively researched in various NLP applications, there is a surprising gap for emotion analysis. However, emotion and gender are closely linked in societal discourse. E.g., women are often thought of as more empathetic, while men's anger is more socially accepted. To fill this gap, we present the first comprehensive study of gendered emotion attribution in five state-of-the-art LLMs (open- and closed-source). We investigate whether emotions are gendered, and whether these variations are based on societal stereotypes. We prompt the models to adopt a gendered persona and attribute emotions to an event like 'When I had a serious argument with a dear person'. We then analyze the emotions generated by the models in relation to the gender-event pairs. We find that all models consistently exhibit gendered emotions, influenced by gender stereotypes. These findings are in line with established research in psychology and gender studies. Our study sheds light on the complex societal interplay between language, gender, and emotion. The reproduction of emotion stereotypes in LLMs allows us to use those models to study the topic in detail, but raises questions about the predictive use of those same LLMs for emotion applications.

What problem does this paper attempt to address?

The paper examines whether large language models (LLMs) reproduce societal gender stereotypes in emotion attribution tasks. Specifically, the researchers ask whether these models attribute different emotions to the same event depending on the gender of the persona they are prompted to adopt.

The main contributions of the paper are as follows:

1. It presents the first systematic study of gender bias in emotion attribution across five state-of-the-art LLMs.
2. It combines the persona (role-playing) capabilities of LLMs with events from the International Survey on Emotion Antecedents and Reactions (ISEAR) dataset to set up the emotion attribution task.
3. It provides a quantitative analysis of over 200,000 completions generated by the five models, covering over 7,000 events and two personas (male and female) and including more than 400 unique emotions.
4. It adds a qualitative analysis of the explanations the models generate.

The experiments show that all tested models consistently attribute emotions along gendered lines, in ways shaped by gender stereotypes. For example, the models tend to associate SADNESS with women and ANGER with men. However, when the model outputs are compared with the gender and emotions reported in the ISEAR dataset, these associations do not match the emotions that men and women actually reported. This raises questions about using these models for emotion-related applications.

In summary, the paper asks: do large language models reflect gendered stereotypes when handling emotion attribution tasks? It answers this question through a series of quantitative and qualitative analyses.
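The setup summarized above (prompt a model with a gendered persona and an event, then compare the attributed emotions across personas) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the prompt template, the function names `build_prompt`, `tally_emotions`, and `gender_gap`, and the toy completions are hypothetical and do not reproduce the paper's actual prompts, data, or metrics.

```python
from collections import Counter

PERSONAS = ("woman", "man")


def build_prompt(persona: str, event: str) -> str:
    # Hypothetical template; the paper's exact prompt wording is not reproduced here.
    return (
        f"Take the role of a {persona}. "
        f"What is the main emotion you would feel while experiencing this event: '{event}'? "
        "Answer with a single emotion."
    )


def tally_emotions(completions):
    """Count emotions per persona from (persona, emotion) pairs."""
    counts = {persona: Counter() for persona in PERSONAS}
    for persona, emotion in completions:
        # Normalize case and whitespace so 'Sadness ' and 'sadness' are merged.
        counts[persona][emotion.strip().upper()] += 1
    return counts


def gender_gap(counts, emotion: str) -> float:
    """Share of `emotion` among the woman persona's answers minus the man persona's share."""
    def share(persona: str) -> float:
        total = sum(counts[persona].values())
        return counts[persona][emotion] / total if total else 0.0

    return share("woman") - share("man")


# Toy completions invented for illustration only (not results from the paper).
toy_completions = [
    ("woman", "sadness"), ("woman", "Sadness "), ("woman", "anger"),
    ("man", "anger"), ("man", "anger"), ("man", "sadness"),
]
toy_counts = tally_emotions(toy_completions)
```

Under these assumptions, a positive `gender_gap(toy_counts, "SADNESS")` means the emotion is attributed more often to the woman persona, and a negative value means it is attributed more often to the man persona. A real analysis would replace the toy list with model completions and would likely add a statistical test.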

Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution

Gender bias and stereotypes in Large Language Models

An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models

Evaluation of Large Language Models: STEM education and Gender Stereotypes

Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts

Analyzing Cultural Representations of Emotions in LLMs through Mixed Emotion Survey

Gender Bias in Decision-Making with Large Language Models: A Study of Relationship Conflicts

Gender Bias in Large Language Models across Multiple Languages

Evaluating Gender Bias of LLMs in Making Morality Judgements

Unveiling Gender Bias in Large Language Models: Using Teacher's Evaluation in Higher Education As an Example

Assessing Gender Bias in LLMs: Comparing LLM Outputs with Human Perceptions and Official Statistics

Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models

Public Perceptions of Gender Bias in Large Language Models: Cases of ChatGPT and Ernie

Multilingual Language Models are not Multicultural: A Case Study in Emotion

Adaptable Moral Stances of Large Language Models on Sexist Content: Implications for Society and Gender Discourse

Large Language Models Portray Socially Subordinate Groups as More Homogeneous, Consistent with a Bias Observed in Humans

What an Elegant Bridge: Multilingual LLMs are Biased Similarly in Different Languages

Do Large Language Models Possess Sensitive to Sentiment?

Evaluating LLMs for Gender Disparities in Notable Persons

White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs