Abstract: The manual assessment of creativity by human raters entails unavoidable subjectivity and often requires considerable time and human resources. To address these issues, this paper explores how natural language processing (NLP) methods can be applied to the assessment of creativity. In the Alternative Use Task (AUT), participants were encouraged to generate ideas as fast as possible within a fixed time. It was hypothesized that the similarity of ideas would decrease over time in the AUT, considering design fixation and the limitations of working memory. In the first study, 12 university students completed the AUT in paper-and-pencil form and generated a total of 376 responses. We applied two NLP models, namely BERT (Bidirectional Encoder Representations from Transformers) and USE (Universal Sentence Encoder), to assess the similarity of responses between individuals. The results did not confirm our hypothesis. One prominent reason might be that the applied models represent millions of sentence structures that are over-ecological and too dissimilar to the sentence structures participants used while completing the AUT. Nevertheless, the results did show that BERT and USE could express the semantic information of responses more accurately than Latent Semantic Analysis, a popular computer-aided model for AUT response assessment. In study 2, we proposed an algorithm, based on word embeddings and crowdsourced responses, to reanalyze the 376 responses from study 1. A total of 1690 crowdsourced responses were collected from 550 participants who completed an online version of the AUT. The results supported our hypothesis and showed that the similarity of responses increases as time passes. This indicates that the proposed algorithm would alleviate the influence of sentence structure in the AUT. The differences between BERT, USE, and the proposed algorithm are discussed in relation to the assessment of creativity, and the implications for future work are explored in depth.
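The abstract does not specify how response similarity was computed or aggregated over time, so the following is only a minimal sketch of one plausible reading: cosine similarity between the embeddings of consecutive responses, followed by a least-squares slope to check whether similarity rises or falls across the session. The vectors in `toy_embeddings` are made-up stand-ins for sentence embeddings (for example from BERT or USE), and the function names are hypothetical, not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def consecutive_similarities(embeddings):
    """Similarity of each response to the one generated just before it."""
    return [cosine_similarity(a, b) for a, b in zip(embeddings, embeddings[1:])]


def trend_slope(values):
    """Least-squares slope of values against their position 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Assumption: toy 3-dimensional vectors standing in for real sentence
# embeddings of one participant's responses, listed in generation order.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]

sims = consecutive_similarities(toy_embeddings)
slope = trend_slope(sims)
print("consecutive similarities:", [round(s, 3) for s in sims])
print("trend slope:", round(slope, 3))
```

A negative slope on real data would correspond to similarity decreasing over time and a positive slope to similarity increasing; with real responses, the toy vectors would be replaced by the output of an actual embedding model.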

Natural Language Processing Algorithms for Divergent Thinking Assessment
