EtiCor: Corpus for Analyzing LLMs for Etiquettes

Ashutosh Dwivedi, Pradhyumna Lavania, Ashutosh Modi
2023-10-29
Abstract: Etiquettes are an essential ingredient of day-to-day interactions among people. Moreover, etiquettes are region-specific, and etiquettes in one region might contradict those in other regions. In this paper, we propose EtiCor, an Etiquettes Corpus, having texts about social norms from five different regions across the globe. The corpus provides a test bed for evaluating LLMs for knowledge and understanding of region-specific etiquettes. Additionally, we propose the task of Etiquette Sensitivity. We experiment with state-of-the-art LLMs (Delphi, Falcon40B, and GPT-3.5). Initial results indicate that LLMs mostly fail to understand etiquettes from regions in the non-Western world.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper examines how well large language models (LLMs) know and understand social etiquette, which differs from region to region, and where that understanding breaks down. Specifically:

1. **Introduction of the EtiCor Corpus**: The paper introduces EtiCor, a corpus of texts about social norms from five regions (East Asia, India, Middle East and Africa, North America and Europe, Latin America). The corpus serves as a test bed for evaluating how well LLMs understand the etiquette specific to each region.
2. **Definition of the Etiquette Sensitivity Task**: The paper proposes the task of "etiquette sensitivity", which assesses whether a model can correctly determine if a given social norm applies in a specific region.
3. **Experimental Evaluation**: The paper evaluates several state-of-the-art LLMs (Delphi, Falcon-40B, and GPT-3.5) in a zero-shot setting and finds significant knowledge gaps when the models deal with etiquette from non-Western regions.
4. **Model Fine-Tuning**: The researchers also fine-tune a BERT model and evaluate it with five-fold cross-validation, reporting good performance across the regions.

Overall, the study aims to reveal the shortcomings of existing LLMs in recognizing the diversity of etiquette around the world and to pave the way for more inclusive and culturally sensitive AI systems.
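To make the etiquette sensitivity evaluation concrete, the following is a minimal sketch of per-region scoring, not the authors' implementation. It assumes each example is a (region, text, binary label) triple, where 1 means the norm is acceptable in that region. The `Example` type, the region codes, the placeholder texts, and the `predict` callable are all hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class Example(NamedTuple):
    region: str  # hypothetical region code, e.g. "IN" or "NE"
    text: str    # statement describing a social norm
    label: int   # assumed encoding: 1 = acceptable in the region, 0 = not


def per_region_accuracy(
    examples: Iterable[Example],
    predict: Callable[[str, str], int],
) -> Dict[str, float]:
    """Return the fraction of correct predictions for each region."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for ex in examples:
        total[ex.region] += 1
        if predict(ex.region, ex.text) == ex.label:
            correct[ex.region] += 1
    return {region: correct[region] / total[region] for region in total}


# Placeholder data only; these are not entries from EtiCor.
examples = [
    Example("IN", "placeholder norm 1", 1),
    Example("IN", "placeholder norm 2", 0),
    Example("NE", "placeholder norm 3", 1),
]


def always_acceptable(region: str, text: str) -> int:
    """Trivial baseline that labels every norm as acceptable."""
    return 1


scores = per_region_accuracy(examples, always_acceptable)
print(scores)
```

In practice, `predict` would wrap a zero-shot prompt to an LLM or a fine-tuned classifier. Reporting scores per region rather than as one pooled number is what makes gaps between regions visible.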