FEED PETs: Further Experimentation and Expansion on the Disambiguation of Potentially Euphemistic Terms

Patrick Lee, Iyanuoluwa Shode, Alain Chirino Trujillo, Yuan Zhao, Olumide Ebenezer Ojo, Diana Cuevas Plancarte, Anna Feldman, Jing Peng
2023-06-07
Abstract: Transformers have been shown to work well for the task of English euphemism disambiguation, in which a potentially euphemistic term (PET) is classified as euphemistic or non-euphemistic in a particular context. In this study, we expand on the task in two ways. First, we annotate PETs for vagueness, a linguistic property associated with euphemisms, and find that transformers are generally better at classifying vague PETs, suggesting linguistic differences in the data that impact performance. Second, we present novel euphemism corpora in three different languages: Yoruba, Spanish, and Mandarin Chinese. We perform euphemism disambiguation experiments in each language using multilingual transformer models mBERT and XLM-RoBERTa, establishing preliminary results from which to launch future work.
Computation and Language
### What problem does this paper attempt to address?

This paper explores two questions:

1. **The impact of vagueness on the classification of potentially euphemistic terms (PETs)**: The authors manually annotate PETs as vague euphemistic terms (VETs) or non-vague, and find that transformer models generally perform better on the vague ones. The reasons are not yet clear, but the finding suggests that linguistic differences in the data affect model performance.
2. **Multilingual euphemism disambiguation experiments**: The paper extends the task beyond English by creating new euphemism corpora in three languages (Yoruba, Spanish, and Mandarin Chinese) and running preliminary experiments with the multilingual transformer models mBERT and XLM-RoBERTa. The results provide a baseline for future research on multilingual and cross-lingual euphemism processing.

Together, these studies aim to clarify what affects transformer performance on euphemism disambiguation and to test how well the approach carries over to other languages.
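The vagueness comparison comes down to scoring a classifier's predictions separately for vague and non-vague PETs. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. It assumes binary labels (1 = euphemistic, 0 = non-euphemistic) and macro-F1 as the metric, which the text above does not specify. The function names and the toy examples are hypothetical and exist only to exercise the code; they are not data or results from the paper.

```python
from collections import defaultdict


def f1_for_label(gold, pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


def macro_f1_by_group(examples):
    """Score predictions separately per group.

    `examples` is an iterable of (group, gold_label, predicted_label)
    tuples, where group is e.g. "vague" or "non_vague".
    """
    grouped = defaultdict(lambda: ([], []))
    for group, gold, pred in examples:
        grouped[group][0].append(gold)
        grouped[group][1].append(pred)
    return {group: macro_f1(golds, preds) for group, (golds, preds) in grouped.items()}


# Made-up toy predictions, only to show the call shape (NOT results from the paper).
toy_examples = [
    ("vague", 1, 1),
    ("vague", 0, 0),
    ("vague", 1, 1),
    ("vague", 0, 1),
    ("non_vague", 1, 0),
    ("non_vague", 0, 0),
    ("non_vague", 1, 1),
    ("non_vague", 0, 1),
]

scores = macro_f1_by_group(toy_examples)
for group, score in sorted(scores.items()):
    print(f"{group}: macro-F1 = {score:.3f}")
```

In practice the predicted labels would come from a fine-tuned model such as mBERT or XLM-RoBERTa, and the same per-group scoring could be applied to each language's corpus.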