Abstract: Have Large Language Models (LLMs) developed a personality? The short answer is a resounding "We Don't Know!" In this paper, we show that we do not yet have the right tools to measure personality in language models. Personality is an important characteristic that influences behavior. As LLMs emulate human-like intelligence and performance in various tasks, a natural question to ask is whether these models have developed a personality. Previous works have evaluated machine personality through self-assessment personality tests, which are sets of multiple-choice questions created to evaluate personality in humans. A fundamental assumption here is that human personality tests can accurately measure personality in machines. In this paper, we investigate the emergence of personality in five LLMs of different sizes, ranging from 1.5B to 30B parameters. We propose the Option-Order Symmetry property as a necessary condition for the reliability of these self-assessment tests. Under this condition, the answer to a self-assessment question is invariant to the order in which the options are presented. We find that many LLMs' personality test responses do not preserve option-order symmetry. We take a deeper look at the LLM test responses where option-order symmetry is preserved and find that, in these cases, the LLMs do not take into account the situational statement being tested and produce the exact same answer irrespective of the situation. We also identify inherent biases in these LLMs, which are the root cause of this phenomenon and make self-assessment tests unreliable. These observations indicate that self-assessment tests are not the correct tools to measure personality in LLMs. Through this paper, we hope to draw attention to the shortcomings of the current literature on measuring personality in LLMs and call for the development of tools for machine personality measurement.

What problem does this paper attempt to address?

This paper asks two questions: have large language models (LLMs) developed personalities and, if so, how can those personalities be measured? Specifically, it explores the following issues:

1. **Applicability of personality measurement tools**: Are the self-assessment tests designed to measure human personality applicable to large language models?
2. **Definition of personality**: The paper defines personality as a consistent pattern of behavior exhibited across different situations. It focuses on the behavior patterns of these models in the real world, not on whether they possess human emotions, self-awareness, or consciousness.
3. **Key attribute of test reliability**: The paper proposes "Option-Order Symmetry" as a necessary condition for the reliability of self-assessment tests. This property requires that, for the same question, the model's response remains unchanged no matter how the options are ordered.
4. **Limitations of existing methods**: The paper finds that many large language models do not satisfy option-order symmetry when answering self-assessment tests, which makes the test results unreliable. In addition, where symmetry does hold, the models often ignore the specific situation described and give the same answer every time, which further indicates that these tests are ineffective for measuring model personality.
5. **Inherent biases**: The paper also identifies inherent biases in these language models. These biases cause the models to show consistent preferences for certain choices, which undermines the reliability of the tests.

Through these findings, the paper hopes to draw the academic community's attention to the deficiencies of current methods for measuring personality in large language models, and it calls for the development of tools designed specifically for measuring machine personality.
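The option-order symmetry condition and the "same answer regardless of situation" failure mode described above can be illustrated with a small sketch. This is not the paper's evaluation code: the `Model` interface, the three option texts, the example statements, and both toy models are hypothetical stand-ins chosen only to make the two checks concrete.

```python
from itertools import permutations
from typing import Callable, Sequence, Set

# Hypothetical interface (an assumption, not from the paper): a model receives
# a statement and the options in display order, and returns the index it picks.
Model = Callable[[str, Sequence[str]], int]

# Illustrative option texts and statements; real tests use their own wording.
OPTIONS = ["Agree", "Neutral", "Disagree"]
STATEMENTS = ["I enjoy meeting new people.", "I avoid crowded places."]


def chosen_texts(model: Model, statement: str, options: Sequence[str]) -> Set[str]:
    """Collect the option texts the model selects across every option ordering."""
    picks = set()
    for order in permutations(options):
        picks.add(order[model(statement, order)])
    return picks


def is_option_order_symmetric(model: Model, statement: str, options: Sequence[str]) -> bool:
    """Symmetry holds if the selected text is the same under all orderings."""
    return len(chosen_texts(model, statement, options)) == 1


def ignores_situation(model: Model, statements: Sequence[str], options: Sequence[str]) -> bool:
    """True if the model gives the same answer text for every statement."""
    answers = {options[model(s, options)] for s in statements}
    return len(answers) == 1


def position_biased_model(statement: str, options: Sequence[str]) -> int:
    # Toy model: always picks whatever is displayed first.
    return 0


def content_biased_model(statement: str, options: Sequence[str]) -> int:
    # Toy model: always picks "Agree", wherever it appears.
    return list(options).index("Agree")


if __name__ == "__main__":
    for name, model in [("position-biased", position_biased_model),
                        ("content-biased", content_biased_model)]:
        symmetric = all(is_option_order_symmetric(model, s, OPTIONS) for s in STATEMENTS)
        constant = ignores_situation(model, STATEMENTS, OPTIONS)
        print(f"{name}: symmetric={symmetric}, same answer for all statements={constant}")
```

The position-biased toy model fails the symmetry check because its selected text changes with the ordering. The content-biased toy model passes the symmetry check yet answers every statement identically, which mirrors the summary's point that symmetry is a necessary condition for reliability, not a sufficient one.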

Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs

Self-Assessment Tests are Unreliable Measures of LLM Personality

Identifying Multiple Personalities in Large Language Models with External Evaluation

Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models

You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments

Challenging the Validity of Personality Tests for Large Language Models

Personality Traits in Large Language Models

LMLPA: Language Model Linguistic Personality Assessment

Is Self-knowledge and Action Consistent or Not: Investigating Large Language Model's Personality

Humanity in AI: Detecting the Personality of Large Language Models

Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models

PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits

Illuminating the Black Box: A Psychometric Investigation into the Multifaceted Nature of Large Language Models

Personality testing of Large Language Models: Limited temporal stability, but highlighted prosociality

Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics

Eliciting Personality Traits in Large Language Models

Can Large Language Models Assess Personality from Asynchronous Video Interviews? A Comprehensive Evaluation of Validity, Reliability, Fairness, and Rating Patterns

Revisiting the Reliability of Psychological Scales on Large Language Models

LLMs Simulate Big Five Personality Traits: Further Evidence