Cultural Value Differences of LLMs: Prompt, Language, and Model Size

Qishuai Zhong, Yike Yun, Aixin Sun

2024-06-17

Abstract: Our study aims to identify behavior patterns in cultural values exhibited by large language models (LLMs). The studied variants include question ordering, prompting language, and model size. Our experiments reveal that each tested LLM can efficiently behave with different cultural values. More interestingly: (i) LLMs exhibit relatively consistent cultural values when presented with prompts in a single language. (ii) The prompting language (e.g., Chinese or English) can influence the expression of cultural values: the same question can elicit divergent cultural values when the same LLM is queried in a different language. (iii) Differences in size within the same model family (e.g., Llama2-7B vs. 13B vs. 70B) have a more significant impact on the demonstrated cultural values than differences between model families (e.g., Llama2 vs. Mixtral). Our experiments reveal that query language and model size are the main factors resulting in cultural value differences.

Computers and Society, Computation and Language

What problem does this paper attempt to address?

The paper explores how large language models (LLMs) express cultural values under different conditions. Specifically, the study focuses on the following aspects:

1. **Prompt Variations within a Single Language**: When an LLM is prompted in different ways within the same language, it expresses relatively consistent cultural values even though the prompt content changes. However, changing the position of the answer options can affect the model's responses.
2. **Impact of Language**: The language of the prompt (e.g., Chinese vs. English) significantly influences the cultural values the model expresses. Even the same question can elicit different cultural values when asked in a different language.
3. **Impact of Model Size**: The differences in cultural values between different sizes of the same model series (e.g., Llama2-7B, 13B, 70B) are more significant than the differences between models of different series.

The paper uses Hofstede's Values Survey Module (VSM) to quantify and compare the cultural values expressed by the models under each condition. It finds that query language and model size are the main factors leading to differences in cultural values. The paper also examines how sensitive the models are to option ordering and how consistently they behave in a multilingual context. Overall, the study reveals the differences in cultural values exhibited by language models across languages and scales, as well as the underlying reasons.
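The sensitivity to option ordering mentioned above can be illustrated with a small sketch. It is not the paper's evaluation code: the function names (`permuted_prompts`, `to_original_index`, `consistency`), the sample question, and the stand-in "models" are all hypothetical, chosen only to show one way of checking whether an answer follows the content of an option or merely its position.

```python
import itertools
from collections import Counter


def permuted_prompts(question, options):
    """Yield (prompt, order) pairs, one for every ordering of the options.

    `order[i]` is the original index of the option shown at position i + 1.
    """
    for order in itertools.permutations(range(len(options))):
        lines = [question]
        for position, option_index in enumerate(order, start=1):
            lines.append(f"{position}. {options[option_index]}")
        yield "\n".join(lines), order


def to_original_index(chosen_position, order):
    """Map a 1-based position picked by the model back to the original option."""
    return order[chosen_position - 1]


def consistency(original_choices):
    """Fraction of runs that agree with the most frequently chosen option."""
    counts = Counter(original_choices)
    return counts.most_common(1)[0][1] / len(original_choices)


# Hypothetical survey-style item (illustrative only, not taken from the VSM).
question = "How important is it to have security of employment?"
options = ["of utmost importance", "of moderate importance", "of little importance"]


def always_first(prompt, order):
    """Stand-in for a position-biased model: always answers '1'."""
    return 1


def content_driven(prompt, order):
    """Stand-in for an order-robust model: always picks original option 1."""
    return order.index(1) + 1


for name, model in [("position-biased", always_first), ("content-driven", content_driven)]:
    choices = [
        to_original_index(model(prompt, order), order)
        for prompt, order in permuted_prompts(question, options)
    ]
    print(name, round(consistency(choices), 3))
```

In practice the stand-in functions would be replaced by calls to the LLM under test, with its reply parsed into a position number. A consistency score near 1.0 suggests the answer tracks the option's content, while a score near 1/len(options) suggests the model is mostly responding to position.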
