Abstract: The manual assessment of creativity by human raters entails unavoidable subjectivity and is often costly in time and human resources. To address these issues, this paper explores how natural language processing (NLP) methods can be applied to the assessment of creativity. In the Alternative Use Task (AUT), participants were encouraged to generate ideas as fast as possible for a fixed time. It was hypothesized that the similarity of ideas would decrease over time in the AUT, considering design fixation and the limitations of working memory. In the first study, 12 university students completed the AUT in paper-and-pencil form and generated a total of 376 responses. We applied two NLP models, namely BERT (Bidirectional Encoder Representations from Transformers) and USE (Universal Sentence Encoder), to assess the similarity of responses between individuals. The results did not confirm our hypothesis. One likely reason is that the applied models represent millions of sentence structures that are overly ecological and too dissimilar from the sentence structures participants used when completing the AUT. Nevertheless, the results did show that BERT and USE capture the semantic information of responses more accurately than Latent Semantic Analysis (LSA), a popular computer-aided model for AUT response assessment. In the second study, we proposed an algorithm, based on word embeddings and crowdsourced responses, to reanalyze the 376 responses from the first study. A total of 1690 crowdsourced responses were collected from 550 participants who completed an online version of the AUT. The results supported our hypothesis and showed that the similarity of responses increases as time passes. This indicates that the proposed algorithm can alleviate the influence of sentence structure in AUT tasks. The differences between BERT, USE, and the proposed algorithm are discussed in relation to the assessment of creativity, and the implications for future work are explored in depth.
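As a rough illustration of the kind of analysis summarized above, the following sketch scores each response by its mean cosine similarity to a pool of reference responses and then fits a linear trend over response order. It is not the authors' algorithm: the function names, the two-dimensional toy vectors, and the use of a least-squares slope are all assumptions made for illustration. In practice the vectors would come from an embedding model such as BERT, USE, or word embeddings.

```python
import numpy as np

def similarity_to_pool(response_vecs, pool_vecs):
    """Mean cosine similarity of each response vector to all pool vectors.

    Assumes every vector is non-zero (hypothetical helper, not from the paper).
    """
    r = np.asarray(response_vecs, dtype=float)
    p = np.asarray(pool_vecs, dtype=float)
    # Normalize rows to unit length so the dot product equals cosine similarity.
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    return (r @ p.T).mean(axis=1)

def trend_slope(values):
    """Least-squares slope of the values against their position in the sequence."""
    x = np.arange(len(values))
    slope, _intercept = np.polyfit(x, values, 1)
    return float(slope)

# Toy data (hypothetical): three responses in the order they were generated,
# and a one-item reference pool standing in for crowdsourced responses.
responses = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
pool = [[0.0, 1.0]]

sims = similarity_to_pool(responses, pool)
slope = trend_slope(sims)
print("similarities:", sims)
print("slope over response order:", slope)
```

A positive slope would indicate that later responses are more similar to the reference pool, and a negative slope that they are less similar. Whether a linear trend is the appropriate test is itself an assumption of this sketch.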
