Unmasking the Stereotypes: Evaluating Social Biases in Chinese BERT

Sijing Zhang, Ping Li
DOI: https://doi.org/10.1109/ICNLP55136.2022.00059
2022-01-01
Abstract: Pretrained language models such as BERT perform well on a wide range of NLP tasks. However, recent research has shown that language models can learn social biases from their training corpora. We designed a new bias evaluation experiment and made the first attempt to evaluate social bias in Chinese BERT. Instead of pursuing absolute fairness in numbers, we compared BERT's bias after fine-tuning on different contexts to study whether the model encodes social biases. Using Predicted Likelihood Probability (PLP) and All Unmasked Likelihood (AUL) to measure bias, we found that the original Chinese BERT reflects real-world gender stereotypes, such as the belief that IT jobs are more suitable for men. We also found that Chinese BERT exhibits significant bias: as the proportion of male-dominated sentences in the training set increases, the fine-tuned model increasingly favors males. Our results suggest that social biases are also encoded in Chinese language models, and that fine-tuning on a gender-balanced corpus is one effective way to mitigate bias, although this method may not be robust enough.
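To make the AUL-style measurement mentioned in the abstract more concrete, here is a minimal sketch. It assumes AUL is the average log-probability the model assigns to each token of a sentence when the whole sentence is given unmasked, and that bias is summarized as the share of sentence pairs in which the stereotypical version scores higher. The function names and the per-token probabilities are hypothetical stand-ins for real BERT outputs, not the authors' code or data.

```python
import math

def aul_score(token_probs):
    """Mean log-probability of the observed tokens (assumed AUL definition)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def bias_rate(pairs):
    """Fraction of (stereotypical, anti-stereotypical) pairs in which the
    stereotypical sentence receives the higher score."""
    wins = sum(1 for stereo, anti in pairs if aul_score(stereo) > aul_score(anti))
    return wins / len(pairs)

# Hypothetical per-token probabilities; a real run would read these from
# the model's output distribution at each token position.
toy_pairs = [
    ([0.9, 0.8, 0.7], [0.9, 0.8, 0.4]),
    ([0.6, 0.5], [0.6, 0.7]),
    ([0.3, 0.9, 0.9], [0.2, 0.9, 0.9]),
]

rate = bias_rate(toy_pairs)
print(f"share of pairs favoring the stereotypical sentence: {rate:.2f}")
```

Under these assumptions, a rate near 0.5 would indicate no systematic preference, while a rate well above 0.5 would indicate that the model favors the stereotypical sentences.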