Abstract:

Background: Personality psychology, an essential branch of psychology, studies personality and its variation among individuals. In recent years, machine learning research on personality assessment has begun to focus on the online environment and has shown outstanding performance. However, the aspects of personality that these prediction models measure remain unclear because few studies have examined the interpretability of personality prediction models. The objective of this study is to develop and validate a machine learning model that incorporates domain knowledge to enhance accuracy and improve interpretability.

Methods: Study participants were recruited via an online experiment platform. After excluding unqualified participants and downloading the Weibo posts of eligible participants, we used six psycholinguistic and mental health-related lexicons to extract textual features. The personality prediction model was then developed using the multi-objective extra trees method based on 3,411 pairs of social media expression and personality trait scores. Subsequently, the prediction model's validity and reliability were evaluated, and each lexicon's feature importance was calculated. Finally, the interpretability of the machine learning model was discussed.

Results: The features from the Culture Value Dictionary were found to be the most important predictors. The fivefold cross-validation results of the prediction model for the personality traits ranged between 0.44 and 0.48 (p < 0.001). The correlation coefficients of the five personality traits between the two "split-half" datasets ranged from 0.84 to 0.88 (p < 0.001). Moreover, the model performed well in terms of construct validity.

Conclusion: By introducing domain knowledge into the development of a machine learning model, this study not only ensures the reliability and validity of the prediction model but also improves the interpretability of the machine learning method.
The study helps explain aspects of personality measured by such prediction models and finds a link between personality and mental health. Our research also has positive implications regarding the combination of machine learning approaches and domain knowledge in the field of psychiatry and its applications to mental health.
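
The evaluation pipeline described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `ExtraTreesRegressor` as the multi-output extra trees model, replaces the Weibo lexicon features and trait scores with synthetic data, and simulates the split-half check by perturbing the feature matrix twice. All sizes, names, and noise levels are hypothetical.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 300 users, 20 lexicon features, 5 trait scores.
n_users, n_features, n_traits = 300, 20, 5
X = rng.normal(size=(n_users, n_features))
W = rng.normal(size=(n_features, n_traits))
Y = X @ W + rng.normal(scale=2.0, size=(n_users, n_traits))

# ExtraTreesRegressor accepts a 2-D target, so one model predicts all five traits.
model = ExtraTreesRegressor(n_estimators=100, random_state=0)

# Five-fold cross-validation: every user is predicted by a model that never saw them.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
Y_pred = cross_val_predict(model, X, Y, cv=cv)

# Pearson correlation between predicted and observed scores, one value per trait.
cv_r = np.array(
    [np.corrcoef(Y[:, k], Y_pred[:, k])[0, 1] for k in range(n_traits)]
)

# Impurity-based feature importances from a model fit on all users.
model.fit(X, Y)
importances = model.feature_importances_

# Split-half reliability (assumption): two noisy copies of the features stand in
# for features extracted from two halves of each user's posts.
X_half_a = X + rng.normal(scale=0.3, size=X.shape)
X_half_b = X + rng.normal(scale=0.3, size=X.shape)
pred_a, pred_b = model.predict(X_half_a), model.predict(X_half_b)
split_half_r = np.array(
    [np.corrcoef(pred_a[:, k], pred_b[:, k])[0, 1] for k in range(n_traits)]
)

print("cross-validated r per trait:", np.round(cv_r, 2))
print("split-half r per trait:", np.round(split_half_r, 2))
print("most important feature index:", int(np.argmax(importances)))
```

With real data, per-lexicon importance could be obtained by summing `importances` over the columns that belong to each lexicon. The numbers printed here come from synthetic data and say nothing about the reported results.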

An Open-source Benchmark of Deep Learning Models for Audio-visual Apparent and Self-reported Personality Recognition

Integrating audio and visual modalities for multimodal personality trait recognition via hybrid deep learning

Video-based multimodal personality analysis

Predicting Personality based on Self-Introduction Video

Multimodal analysis of personality traits on videos of self-presentation and induced behavior

How Social Media Expression Can Reveal Personality

Domain-specific Learning of Multi-scale Facial Dynamics for Apparent Personality Traits Prediction

Deep Bimodal Regression For Apparent Personality Analysis

Personality Analysis from Online Short Video Platforms with Multi-domain Adaptation

Speech Personality Recognition Based on Annotation Classification Using Log-Likelihood Distance and Extraction of Essential Audio Features

Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features

Humanity in AI: Detecting the Personality of Large Language Models

Getting Personal: A Deep Learning Artifact for Text-Based Measurement of Personality

Deep Bimodal Regression of Apparent Personality Traits from Short Video Sequences

Multimodal Video-based Apparent Personality Recognition Using Long Short-Term Memory and Convolutional Neural Networks

Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues

Evaluating and Inducing Personality in Pre-trained Language Models

Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models

Automatic Personality Perception from Speech in Mandarin

Deep learning-based personality recognition from text posts of online social networks

Comparing approaches for mitigating intergroup variability in personality recognition