Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs

Divyanshu Kumar, Umang Jain, Sahil Agarwal, Prashanth Harshangi
2024-10-13
Abstract: Large Language Models (LLMs) are being adopted across a wide range of tasks, including decision-making processes in industries where bias in AI systems is a significant concern. Recent research indicates that LLMs can harbor implicit biases even when they pass explicit bias evaluations. Building upon the frameworks of the LLM Implicit Association Test (IAT) Bias and LLM Decision Bias, this study highlights that newer or larger language models do not automatically exhibit reduced bias; in some cases, they displayed higher bias scores than their predecessors, such as in Meta's Llama series and OpenAI's GPT models. This suggests that increasing model complexity without deliberate bias mitigation strategies can unintentionally amplify existing biases. The variability in bias scores within and across providers underscores the need for standardized evaluation metrics and benchmarks for bias assessment. The lack of consistency indicates that bias mitigation is not yet a universally prioritized goal in model development, which can lead to unfair or discriminatory outcomes. By broadening the detection of implicit bias, this research provides a more comprehensive understanding of the biases present in advanced models and underscores the critical importance of addressing these issues to ensure the development of fair and responsible AI systems.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses implicit bias in large language models (LLMs). Although these models perform well on a wide range of tasks, they can carry implicit biases that explicit bias evaluations fail to detect. Through a large-scale study of more than 50 LLMs, the paper examines how model size and age relate to implicit bias. It finds that newer or larger models do not necessarily show less bias and in some cases score higher than their predecessors, which suggests that increasing model complexity without deliberate bias mitigation may inadvertently amplify existing biases. The main contributions of the paper are:
1. Large-scale experiments measuring Implicit Association Test (IAT) bias and decision bias in more than 50 LLMs, which confirm the presence of implicit bias in these models.
2. An analysis showing that newer LLMs exhibit higher levels of bias in some cases, which the researchers speculate may stem from the increased use of synthetic data in training.
Based on these findings, the paper emphasizes the need for more rigorous bias detection and mitigation strategies in the development and deployment of LLMs to ensure fair and responsible AI systems.