Editing Personality for Large Language Models

Shengyu Mao, Xiaohan Wang, Mengru Wang, Yong Jiang, Pengjun Xie, Fei Huang, Ningyu Zhang

2024-09-01

Abstract: This paper introduces a new task: editing the personality traits of Large Language Models (LLMs). The task seeks to adjust a model's responses to opinion-related questions on specified topics, since an individual's personality often manifests in their expressed opinions. Specifically, we construct PersonalityEdit, a new benchmark dataset for this task. Drawing on theory from Social Psychology, we isolate three representative traits, namely Neuroticism, Extraversion, and Agreeableness, as the foundation for our benchmark. We then gather data using GPT-4, generating responses that align with a specified topic and embody the targeted personality trait. We conduct comprehensive experiments involving various baselines and discuss the representation of personality behavior in LLMs. Our findings uncover potential challenges of the proposed task, illustrating several remaining issues. We anticipate that our work can stimulate further annotation in model editing and personality-related research. Code is available at https://github.com/zjunlp/EasyEdit.

Computation and Language, Artificial Intelligence, Computers and Society, Machine Learning, Multiagent Systems

What problem does this paper attempt to address?

The problem this paper attempts to address is how to edit the personality traits of large language models (LLMs). Specifically, the authors propose a new task aimed at adjusting the way LLMs express opinions on specific topics to exhibit different personality traits. To achieve this goal, the authors constructed a new benchmark dataset called PersonalityEdit and selected Neuroticism, Extraversion, and Agreeableness as the foundational personality traits for the study. By utilizing GPT-4 to generate responses that align with specific personality traits, the authors conducted comprehensive experiments on various model editing methods and discussed the manifestation of personality behaviors in LLMs.

The main contributions of this work include:

1. For the first time, exploring the challenge of editing the personality traits of LLMs and proposing a benchmark dataset, PersonalityEdit.
2. Using GPT-4 for topic-constrained and personality trait-guided data generation, combined with automatic and manual validation to ensure data quality.
3. Proposing several metrics for evaluating personality traits in generated texts, analyzing different baseline methods, and finding that while existing methods can promote personality editing to some extent, the results are still unsatisfactory, highlighting the difficulty of the task.
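To make the task setup above concrete, here is a minimal sketch of how an edit case (topic, target trait, edited-model response) might be represented and scored. It is not the paper's metric or the EasyEdit API: `EditCase`, `target_trait_rate`, and `keyword_classifier` are hypothetical names, the example responses are made up, and the keyword matcher is only a toy stand-in for a real trait classifier.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# The three traits named in the abstract.
TRAITS = ("neuroticism", "extraversion", "agreeableness")


@dataclass(frozen=True)
class EditCase:
    """Hypothetical record for one personality-edit evaluation case."""
    topic: str         # topic the opinion question is about
    target_trait: str  # trait the edited model is supposed to express
    response: str      # text produced by the edited model


def target_trait_rate(
    cases: Sequence[EditCase],
    classify: Callable[[str], str],
) -> float:
    """Fraction of responses whose predicted trait equals the target trait."""
    if not cases:
        return 0.0
    hits = sum(1 for c in cases if classify(c.response) == c.target_trait)
    return hits / len(cases)


def keyword_classifier(text: str) -> str:
    """Toy stand-in for a trained trait classifier (assumption, not a real model)."""
    lowered = text.lower()
    if "worr" in lowered or "anxious" in lowered:
        return "neuroticism"
    if "excit" in lowered or "party" in lowered:
        return "extraversion"
    return "agreeableness"


# Made-up responses, for illustration only.
cases = [
    EditCase("live concerts", "extraversion", "I get so excited at live concerts!"),
    EditCase("live concerts", "neuroticism", "Live concerts are pleasant enough."),
]
rate = target_trait_rate(cases, keyword_classifier)
print(f"target-trait rate: {rate:.2f}")
```

In practice the `classify` argument would be replaced by whatever trait classifier or judge is available; keeping it as a plain callable keeps the scoring logic independent of that choice.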
