What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs

Shu Yang, Shenzhe Zhu, Ruoxuan Bao, Liang Liu, Yu Cheng, Lijie Hu, Mengdi Li, Di Wang

2024-10-08

Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in generating human-like text and exhibiting personality traits similar to those in humans. However, the mechanisms by which LLMs encode and express traits such as agreeableness and impulsiveness remain poorly understood. Drawing on the theory of social determinism, we investigate how long-term background factors, such as family environment and cultural norms, interact with short-term pressures like external instructions, shaping and influencing LLMs' personality traits. By steering the output of LLMs through the utilization of interpretable features within the model, we explore how these background and pressure factors lead to changes in the model's traits without the need for further fine-tuning. Additionally, we suggest the potential impact of these factors on model safety from the perspective of personality.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses the following problem: large language models (LLMs) can generate human-like text and exhibit human-like personality traits, such as agreeableness and impulsiveness, but the mechanisms by which they encode and express these traits remain unclear. Drawing on the theory of social determinism, the paper explores how long-term background factors (such as family environment and cultural norms) and short-term pressures (such as external instructions) interact to shape the personality traits of LLMs. It steers model outputs through interpretable features, without further fine-tuning, to study how these factors change the model's traits, and it discusses their impact on model safety from a personality perspective.

Specifically, the paper focuses on two core questions:

1. How do long-term background factors and short-term pressures shape the personality traits of LLMs? Why do LLMs exhibit traits such as low empathy or warmth?
2. How do these personality traits affect the safety of LLMs? For example, does higher agreeableness make LLMs more susceptible to jailbreak attacks?

By exploring these questions, the paper aims to reveal the mechanisms behind the formation of LLMs' personality traits and to propose methods for controlling and adjusting these traits to improve model safety and reliability.
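
The summary does not spell out the steering procedure, so the following is only a minimal sketch of the general idea of steering without fine-tuning. It assumes that an interpretable feature can be treated as a direction in a model's hidden-state space and that steering means adding a scaled copy of that direction to a hidden state. The function names `steer` and `projection`, the toy vectors, and the strength value are all hypothetical and are not taken from the paper.

```python
import math


def _norm(vec):
    """Euclidean length of a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def steer(hidden, direction, strength):
    """Return hidden + strength * unit(direction).

    hidden    : a hidden-state vector (toy stand-in for a model activation)
    direction : a hypothetical feature direction in the same space
    strength  : how far to push along that direction (negative suppresses it)
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    norm = _norm(direction)
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    return [h + strength * u for h, u in zip(hidden, unit)]


def projection(hidden, direction):
    """Scalar component of hidden along direction (how active the feature is)."""
    return sum(h * d for h, d in zip(hidden, direction)) / _norm(direction)


# Toy values, chosen only for illustration.
hidden = [0.5, -1.0, 2.0]
direction = [3.0, 0.0, 4.0]  # assumed feature direction, not from the paper

steered = steer(hidden, direction, strength=2.0)

# The feature's activation rises by exactly the chosen strength,
# while the model weights are never touched.
delta = projection(steered, direction) - projection(hidden, direction)
```

In a real model, the same addition would be applied to an intermediate activation during the forward pass; which layer and which feature to use are choices this sketch leaves open.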

Editing Personality for LLMs

PersLLM: A Personified Training Approach for Large Language Models

Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models

Identifying Multiple Personalities in Large Language Models with External Evaluation

Personality Traits in Large Language Models

Humanity in AI: Detecting the Personality of Large Language Models

Eliciting Personality Traits in Large Language Models

Eliciting Big Five Personality Traits in Large Language Models: A Textual Analysis with Classifier-Driven Approach

PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits

The Dark Patterns of Personalized Persuasion in Large Language Models: Exposing Persuasive Linguistic Features for Big Five Personality Traits in LLMs Responses

Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs

Extroversion or Introversion? Controlling The Personality of Your Large Language Models

LMLPA: Language Model Linguistic Personality Assessment

Tailoring Personality Traits in Large Language Models via Unsupervisedly-Built Personalized Lexicons

Modeling Human Subjectivity in LLMs Using Explicit and Implicit Human Factors in Personas

Is Self-knowledge and Action Consistent or Not: Investigating Large Language Model's Personality

Personality testing of Large Language Models: Limited temporal stability, but highlighted prosociality

Identifying and Manipulating Personality Traits in LLMs Through Activation Engineering

Illuminating the Black Box: A Psychometric Investigation into the Multifaceted Nature of Large Language Models

Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models