Abstract:Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life. Given the recent popularity and adoption of language generation technologies, the potential to further marginalize this population only grows. Although a multitude of NLP fairness literature focuses on illuminating and addressing gender biases, assessing gender harms for TGNB identities requires understanding how such identities uniquely interact with societal gender norms and how they differ from gender binary-centric perspectives. Such measurement frameworks inherently require centering TGNB voices to help guide the alignment between gender-inclusive NLP and whom they are intended to serve. Towards this goal, we ground our work in the TGNB community and existing interdisciplinary literature to assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation (OLG). This social knowledge serves as a guide for evaluating popular large language models (LLMs) on two key aspects: (1) misgendering and (2) harmful responses to gender disclosure. To do this, we introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community. We discover a dominance of binary gender norms reflected by the models; LLMs least misgendered subjects in generated text when triggered by prompts whose subjects used binary pronouns. Meanwhile, misgendering was most prevalent when triggering generation with singular they and neopronouns. When prompted with gender disclosures, TGNB disclosure generated the most stigmatizing language and scored most toxic, on average. Our findings warrant further research on how TGNB harms manifest in LLMs and serve as a broader case study toward concretely grounding the design of gender-inclusive AI in community voices and interdisciplinary literature.

What problem does this paper attempt to address?

### Problems the Paper Aims to Address This paper aims to address the issues of misgendering and harmful responses towards Transgender and Non-Binary (TGNB) individuals in Open Language Generation (OLG). Specifically, the paper focuses on the following aspects: 1. **Misgendering**: - Investigate whether large language models (LLMs) exhibit misgendering when generating text, i.e., using incorrect pronouns to refer to TGNB individuals. - Evaluate the performance of different types of pronouns (including binary pronouns, singular they, and neopronouns like ey, xe, fae) in model generation. 2. **Harmful Responses to Gender Disclosure**: - Examine whether models generate discriminatory or negative language when prompted with gender disclosure. - Assess the models' responses to gender disclosure information to ensure that such information does not lead to further marginalization or harm. ### Research Background - **Social Reality**: TGNB individuals often face discrimination and exclusion in daily life, a phenomenon that could be amplified in language generation technologies. - **Existing Research**: While fairness research in the field of Natural Language Processing (NLP) has addressed gender bias, specific studies focusing on TGNB individuals are relatively scarce. - **Community Voices**: The paper emphasizes the need to center the voices of the TGNB community to understand their unique social realities, thereby guiding the design of gender-inclusive NLP. ### Methodology 1. **Dataset Construction**: - **TANGO Dataset**: Comprises two parts, one with 2,880 prompts for evaluating misgendering and another with 1,422,720 templates for evaluating responses to gender disclosure. - The dataset is built from real texts on Nonbinary Wiki, covering various pronouns and forms of gender disclosure. 2. **Model Evaluation**: - Four popular large language models (GPT-2, GPT-Neo, OPT, ChatGPT) are used for evaluation. - A combination of automated tools and human annotation is employed to assess misgendering and harmful responses in the generated text. ### Key Findings - **Misgendering**: - Models exhibit the most severe misgendering when handling pronouns commonly used by TGNB individuals (such as singular they and neopronouns). - Binary pronouns (he/she) show higher gender consistency, while neopronouns show lower gender consistency. - Some models face grammatical recognition difficulties when dealing with neopronouns. - **Responses to Gender Disclosure**: - When models receive gender disclosure prompts, the generated text often contains negative emotions and discriminatory language. - Responses to binary gender disclosures are relatively less harmful. ### Conclusions and Recommendations - **Conclusions**: Current large language models exhibit significant issues with misgendering and harmful responses when handling gender information of TGNB individuals. - **Recommendations**: Future research should further explore how to better incorporate the voices of the TGNB community and interdisciplinary literature in designing gender-inclusive AI to reduce misgendering and harmful responses. Through this research, the paper aims to advance the development of natural language generation technologies to be more inclusive and respectful of all gender identities.

"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models

Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies

Laissez-Faire Harms: Algorithmic Biases in Generative Language Models

Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies

Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias

MISGENDERED: Limits of Large Language Models in Understanding Pronouns

Non-Binary Gender Expression in Online Interactions

QueerBench: Quantifying Discrimination in Language Models Toward Queer Identities

''Fifty Shades of Bias'': Normative Ratings of Gender Bias in GPT Generated English Text

Gender Bias in Large Language Models across Multiple Languages

From Bytes to Biases: Investigating the Cultural Self-Perception of Large Language Models

Learning from Red Teaming: Gender Bias Provocation and Mitigation in Large Language Models

A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation

Exploration, detection, and mitigation: Unveiling gender bias in NLP

BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation

Toward Gender-Inclusive Coreference Resolution: An Analysis of Gender and Bias Throughout the Machine Learning Lifecycle

Public Perceptions of Gender Bias in Large Language Models: Cases of ChatGPT and Ernie

Mitigating Bias in Queer Representation within Large Language Models: A Collaborative Agent Approach

Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts

Aligning with Whom? Large Language Models Have Gender and Racial Biases in Subjective NLP Tasks