Gender Bias in LLM-generated Interview Responses

Haein Kong, Yongsu Ahn, Sangyub Lee, Yunho Maeng
2024-10-30
Abstract: LLMs have emerged as a promising tool for assisting individuals in diverse text-generation tasks, including job-related texts. However, LLM-generated answers have been increasingly found to exhibit gender bias. This study evaluates three LLMs (GPT-3.5, GPT-4, Claude) to conduct a multifaceted audit of LLM-generated interview responses across models, question types, and jobs, and their alignment with two gender stereotypes. Our findings reveal that gender bias is consistent, and closely aligned with gender stereotypes and the dominance of jobs. Overall, this study contributes to the systematic examination of gender bias in LLM-generated interview responses, highlighting the need for a mindful approach to mitigate such biases in related applications.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses gender bias in interview responses generated by large language models (LLMs). The researchers evaluated three LLMs (GPT-3.5, GPT-4, Claude), auditing the generated interview responses across models, question types, and jobs, and examining how closely the responses align with two known gender stereotypes. The study found that gender bias is consistent in LLM-generated interview responses and that it is closely aligned with gender stereotypes and occupational dominance, indicating that existing views are reinforced in the content LLMs generate. Overall, the study contributes to the systematic examination of gender bias in LLM-generated interview responses and highlights the need for a mindful approach to mitigating such bias in related applications.