Speciesist language and nonhuman animal bias in English Masked Language Models

Masashi Takeshita, Rafal Rzepka, Kenji Araki
DOI: https://doi.org/10.1016/j.ipm.2022.103050
2022-09-01
Abstract: Warning: This paper contains examples of offensive language, including insulting or objectifying expressions. Various existing studies have analyzed what social biases are inherited by NLP models. Because these biases may directly or indirectly harm people, previous studies have focused only on human attributes. Until recently, however, no research existed on social biases in NLP regarding nonhumans. In this paper, we analyze biases against nonhuman animals, i.e. speciesist bias, inherent in English Masked Language Models such as BERT. (An anonymous previous version of this paper is accessible at https://openreview.net/forum?id=dfqMpjZOgv4; we have also published a paper on this topic in Japanese (Takeshita et al., 2021).) We analyzed speciesist bias against 46 animal names using template-based and corpus-extracted sentences containing speciesist (or non-speciesist) language. We found that pre-trained masked language models tend to associate harmful words with nonhuman animals and have a bias toward using speciesist language for some nonhuman animal names. Our code for reproducing the experiments will be made available on GitHub (https://github.com/Language-Media-Lab/speciesist-language).
computer science, information systems, information science & library science
What problem does this paper attempt to address?