PSN: Persian Social Norms Dataset for Cross-Cultural AI

Hamidreza Saffari, Mohammadamin Shafiei, Francesco Pierri
2024-06-16
Abstract: Datasets capturing cultural norms are essential for developing globally aware AI systems. We present Persian Social Norms (PSN), a novel dataset of over 1.7k Persian social norms, including environments, contexts, and cultural labels, alongside English translations. Leveraging large language models and prompt-engineering techniques, we generated potential norms that were reviewed by native speakers for quality and ethical compliance. As the first Persian dataset of its kind, this resource enables computational modeling of norm adaptation, a crucial challenge for cross-cultural AI informed by diverse cultural perspectives.
Social and Information Networks
### What problem does this paper attempt to address?

This paper addresses the limited understanding of Iranian social norms in cross-cultural AI systems. The authors capture and record cultural norms in Iranian society by creating a dataset named **Persian Social Norms (PSN)**. The dataset contains more than 1,700 manually annotated Iranian social norms covering different environments, situations, and cultural labels, each with an English translation.

#### Main problems include:

1. **Cultural Sensitivity of Cross-cultural AI Systems**:
   - Existing large language models (LLMs) lack sufficient training data to understand the social norms of specific cultures when working with low-resource languages such as Persian.
   - As a result, these models may cause misunderstandings or spread stereotypes in different cultural contexts.
2. **Uniqueness of Iranian Culture**:
   - Iranian culture combines modernism and traditionalism, which makes its social norms complex and difficult for existing AI systems to model accurately.
   - Because Persian is a low-resource language, aligning AI systems with Iranian culture is harder still.
3. **Dynamics and Diversity of Cultural Norms**:
   - The same behavior may be accepted to different degrees in different cultural backgrounds and social environments. For example, a behavior considered normal in some cultures may be regarded as taboo in others.
   - A dataset that reflects this diversity and variation is therefore needed to help AI systems understand and adapt to different cultures.

#### Solutions:

- **Constructing the PSN Dataset**: Use large language models and prompt-engineering techniques to generate potential social norms, then have native speakers review and annotate them to ensure the quality and cultural accuracy of the data.
- **Multi-label Classification**: Classify social norms into three categories, "Expected", "Normal", and "Taboo", to describe more precisely how acceptable a behavior is in a specific environment.
- **Cross-cultural Comparison**: Compare PSN with existing datasets from other cultures (such as NormBank) to support the understanding and analysis of social behaviors on a global scale.

In conclusion, by creating the PSN dataset, this paper provides an important resource for developing more culturally aware AI systems, particularly for Iranian culture. This helps improve the cultural sensitivity of AI systems and also promotes cross-cultural communication and understanding.
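The record structure described above (a norm, its environment and context, one of three labels, and an English translation) can be sketched as a small Python data model. This is only an illustrative sketch: the field names, the `NormRecord` and `NormLabel` types, the `label_counts` helper, and the placeholder rows are all assumptions made here, not the schema or contents of the released PSN dataset.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class NormLabel(Enum):
    """The three acceptance categories named in the summary."""
    EXPECTED = "Expected"
    NORMAL = "Normal"
    TABOO = "Taboo"


@dataclass(frozen=True)
class NormRecord:
    """One norm entry. All field names are hypothetical."""
    norm_fa: str       # norm text in Persian
    norm_en: str       # English translation of the norm
    environment: str   # setting in which the behavior occurs
    context: str       # additional situational detail
    label: NormLabel   # acceptance category for this setting


def label_counts(records, environment):
    """Count how often each label occurs within one environment."""
    return Counter(r.label for r in records if r.environment == environment)


# Placeholder rows for illustration only; they are not taken from PSN.
records = [
    NormRecord("<fa text 1>", "<en text 1>", "env_a", "ctx_1", NormLabel.EXPECTED),
    NormRecord("<fa text 2>", "<en text 2>", "env_a", "ctx_2", NormLabel.TABOO),
    NormRecord("<fa text 3>", "<en text 3>", "env_b", "ctx_1", NormLabel.NORMAL),
]

counts_env_a = label_counts(records, "env_a")
```

Keeping the label as an enum rather than a free string makes it harder to introduce a fourth, misspelled category when loading or comparing data, which matters when the same behavior is labeled differently across environments.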