Abstract: As Large Language Models (LLMs) ascend in popularity, offering information with unprecedented convenience compared to traditional search engines, we delve into the intriguing possibility that a new, singular perspective is being propagated. We call this the "Silent Curriculum," where our focus shifts towards a particularly impressionable demographic: children, who are drawn to the ease and immediacy of acquiring knowledge through these digital oracles. In this exploration, we delve into the sociocultural ramifications of LLMs, which, through their nuanced responses, may be subtly etching their own stereotypes, an algorithmic or AI monoculture. We hypothesize that the convergence of pre-training data, fine-tuning datasets, and analogous guardrails across models may have birthed a distinct cultural lens. We unpack this concept through a short experiment navigating children's storytelling, occupational-ethnic biases, and self-diagnosed annotations, to find that there exists strong cosine similarity (0.87) of biases across these models, suggesting a similar perspective of ethnic stereotypes in occupations. This paper invites a reimagining of LLMs' societal role, especially as the new information gatekeepers, advocating for a paradigm shift towards diversity-rich landscapes over unintended monocultures.

What problem does this paper attempt to address?

This paper asks whether large language models (LLMs), when providing information, inadvertently spread a single cultural perspective that may harm society, especially children. The author refers to this phenomenon as the "Silent Curriculum" and explores how LLMs, through their pre-training data, fine-tuning datasets, and similar guardrails, may form a distinct cultural lens that affects educational content and its accessibility.

### Specific problems include:

1. **The impact of LLM monoculture**:
   - Do LLMs form a unified cultural perspective through their training and optimization processes, which in turn affects the content they generate?
   - Will this unified cultural perspective inadvertently spread specific cultural biases or stereotypes?
2. **The impact on children**:
   - As digital natives, children are more likely to rely on LLMs to acquire knowledge. Will LLMs subtly influence children's cultural cognition and social concepts through the content they generate?
   - Will this influence form a "silent curriculum", that is, transmit certain cultural or social norms in an implicit and informal way?
3. **Cultural diversity and representation**:
   - When generating content, can LLMs fully reflect cultural diversity around the world, or do they lean towards certain specific cultural groups?
   - How can we ensure that LLMs do not overly favor certain cultural groups and ignore other groups when generating content?

### Research methods:

To explore these issues, the author conducted an empirical study using two popular LLMs (GPT-3.5 and LLaMA2-70B) and analyzed the cultural representation and occupational stereotypes in their generated content through the following steps:

1. **Establishing an occupation-race bias benchmark**:
   - Prompt the LLMs to draw inspiration from the WinoBias dataset and create a corpus containing different races and the top 20 occupations.
   - Comparing the racial groups generated by the two LLMs showed an overlap of seven groups: White, Black, Asian, Hispanic, Native American, Middle Eastern, and Latin American.
2. **Generating children's stories**:
   - Prompt the LLMs to write short stories about children growing up to be successful in specific occupations, without directly mentioning the children's races.
   - Analyze the implicit racial and geographical biases in the generated stories.
3. **Self-labeling racial groups**:
   - Have the LLMs label the races of characters based on the names and places in the generated stories.
   - Use cosine similarity analysis to evaluate the consistency and deviation of the LLMs in cultural representation.

### Conclusions:

The results show that LLMs exhibit significant cultural biases when generating content, manifested as a preference for certain racial groups and neglect of others. For example, LLMs are more likely to associate certain occupations with specific races, such as associating "software engineer" with Asians and "coffee grower" with Latin Americans. These findings are a reminder that although LLMs have powerful natural language processing capabilities, their output is still influenced by latent social biases and cultural narratives. The author therefore calls on the academic and industrial communities to work together to challenge and broaden the single-culture perspective of AI and to ensure that multiple voices shape future AI-generated content.
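The cosine similarity comparison described under the research methods can be illustrated with a minimal Python sketch. This is not the author's code: the function names, the choice to flatten per-occupation group proportions into a single vector, and the toy labels are all assumptions made for illustration. Only the seven group names come from the summary above.

```python
import math
from collections import Counter

# The seven groups both models produced, as listed in the summary.
GROUPS = [
    "White", "Black", "Asian", "Hispanic",
    "Native American", "Middle Eastern", "Latin American",
]


def bias_vector(labels_by_occupation, occupations):
    """Flatten per-occupation group proportions into one vector.

    labels_by_occupation maps an occupation to the list of group labels
    a model assigned to its story characters (hypothetical structure).
    """
    vector = []
    for occupation in occupations:
        counts = Counter(labels_by_occupation.get(occupation, []))
        total = sum(counts.values())
        for group in GROUPS:
            vector.append(counts[group] / total if total else 0.0)
    return vector


def cosine_similarity(u, v):
    """Return the cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy labels invented for illustration; they are NOT data from the paper.
occupations = ["software engineer", "coffee grower"]
model_a = {
    "software engineer": ["Asian", "Asian", "White"],
    "coffee grower": ["Latin American", "Latin American", "Hispanic"],
}
model_b = {
    "software engineer": ["Asian", "White", "White"],
    "coffee grower": ["Latin American", "Latin American", "Latin American"],
}

similarity = cosine_similarity(
    bias_vector(model_a, occupations),
    bias_vector(model_b, occupations),
)
print(f"cosine similarity between the two models: {similarity:.2f}")
```

A value close to 1 would indicate that the two models distribute ethnic groups across occupations in nearly the same way, which is the kind of agreement the abstract reports (0.87) for the models it studied.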

The Silent Curriculum: How Does LLM Monoculture Shape Educational Content and Its Accessibility?

"They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations

The Life Cycle of Large Language Models: A Review of Biases in Education

LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education

How Are LLMs Mitigating Stereotyping Harms? Learning from Search Engine Studies

The life cycle of large language models in education: A framework for understanding sources of bias

From Bytes to Biases: Investigating the Cultural Self-Perception of Large Language Models

A Comprehensive Survey of Bias in LLMs: Current Landscape and Future Directions

Breaking Bias, Building Bridges: Evaluation and Mitigation of Social Biases in LLMs via Contact Hypothesis

"The teachers are confused as well": A Multiple-Stakeholder Ethics Discussion on Large Language Models in Computing Education

Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models

Large Language Models Portray Socially Subordinate Groups as More Homogeneous, Consistent with a Bias Observed in Humans

Laissez-Faire Harms: Algorithmic Biases in Generative Language Models

The African Woman is Rhythmic and Soulful: An Investigation of Implicit Biases in LLM Open-ended Text Generation

Generative Monoculture in Large Language Models

With a Grain of SALT: Are LLMs Fair Across Social Dimensions?

Confronting LLMs with Traditional ML: Rethinking the Fairness of Large Language Models in Tabular Classifications

"Im not Racist but...": Discovering Bias in the Internal Knowledge of Large Language Models

The Unequal Opportunities of Large Language Models: Revealing Demographic Bias through Job Recommendations

Unboxing Occupational Bias: Grounded Debiasing of LLMs with U.S. Labor Data

Understanding Intrinsic Socioeconomic Biases in Large Language Models