Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models
Yiran Liu,Ke Yang,Zehan Qi,Xiao Liu,Yang Yu,Chengxiang Zhai
2024-02-24
Abstract:The growing integration of large language models (LLMs) into social
operations amplifies their impact on decisions in crucial areas such as
economics, law, education, and healthcare, raising public concerns about these
models' discrimination-related safety and reliability. However, prior
discrimination measuring frameworks solely assess the average discriminatory
behavior of LLMs, often proving inadequate due to the overlook of an additional
discrimination-leading factor, i.e., the LLMs' prediction variation across
diverse contexts. In this work, we present the Prejudice-Caprice Framework
(PCF) that comprehensively measures discrimination in LLMs by considering both
their consistently biased preference and preference variation across diverse
contexts. Specifically, we mathematically dissect the aggregated contextualized
discrimination risk of LLMs into prejudice risk, originating from LLMs'
persistent prejudice, and caprice risk, stemming from their generation
inconsistency. In addition, we utilize a data-mining approach to gather
preference-detecting probes from sentence skeletons, devoid of attribute
indications, to approximate LLMs' applied contexts. While initially intended
for assessing discrimination in LLMs, our proposed PCF facilitates the
comprehensive and flexible measurement of any inductive biases, including
knowledge alongside prejudice, across various modality models. We apply our
discrimination-measuring framework to 12 common LLMs, yielding intriguing
findings: i) modern LLMs demonstrate significant pro-male stereotypes, ii)
LLMs' exhibited discrimination correlates with several social and economic
factors, iii) prejudice risk dominates the overall discrimination risk and
follows a normal distribution, and iv) caprice risk contributes minimally to
the overall risk but follows a fat-tailed distribution, suggesting that it is
wild risk requiring enhanced surveillance.
Computers and Society,Computation and Language