Abstract: Storytelling is a fundamental aspect of human communication, relying heavily on creativity to produce narratives that are novel, appropriate, and surprising. While large language models (LLMs) have recently demonstrated the ability to generate high-quality stories, their creative capabilities remain underexplored. Previous research has either focused on creativity tests requiring short responses or primarily compared model performance in story generation to that of professional writers. However, the question of whether LLMs exhibit creativity in writing short stories on par with the average human remains unanswered. In this work, we conduct a systematic analysis of creativity in short story generation across LLMs and everyday people. Using a five-sentence creative story task, commonly employed in psychology to assess human creativity, we automatically evaluate model- and human-generated stories across several dimensions of creativity, including novelty, surprise, and diversity. Our findings reveal that while LLMs can generate stylistically complex stories, they tend to fall short in terms of creativity when compared to average human writers.

What problem does this paper attempt to address?

This paper asks whether the creativity of large language models (LLMs) in generating short stories is comparable to that of ordinary people. Specifically, the researchers focus on the following aspects:

1. **Dimensions of Creativity**: Novelty, surprise, and diversity. These dimensions measure whether the generated stories are innovative, unexpected, and rich in content.
2. **Task Setup**: A five-sentence creative short-story generation task, commonly used in psychology to evaluate human creativity. Participants write a 4-5 sentence short story based on three prompt words.
3. **Comparison Objects**: Stories generated by LLMs are compared with those written by ordinary people, not only with those of professional writers. This gives a more comprehensive picture of how LLMs perform in creative writing.
4. **Evaluation Method**: Creativity is quantified through automated evaluation metrics, which makes the evaluation process objective and repeatable.

### Research Background

- **Importance of Story Creation**: Storytelling is a core part of human communication and relies especially on creativity to produce novel, appropriate, and surprising narratives.
- **Advances and Limitations of LLMs**: Although LLMs have demonstrated the ability to generate high-quality stories, their creativity has not been fully explored.
- **Deficiencies of Existing Research**: Most previous studies either focused on creativity tests requiring brief responses or mainly compared models with professional writers, leaving ordinary people out of the comparison.

### Research Objectives

This study aims to systematically analyze the differences in creativity between short stories generated by LLMs and those written by ordinary people, and in particular to reveal the strengths and weaknesses of LLMs in creative writing through multidimensional creativity evaluation metrics. The results show that although LLMs can generate complex stories, they remain inferior to ordinary people in terms of novelty, diversity, and surprise.

### Main Findings

- **Complexity vs. Creativity**: Stories generated by LLMs are more complex in vocabulary and syntax but have lower readability, while human stories are more novel, surprising, and diverse.
- **Impact of Semantic Distance**: When the semantic distance between prompt words is small, the stories generated by humans and LLMs are more novel.
- **Pronoun Use**: Humans are more inclined to write from the first- or second-person perspective, while LLMs prefer the third person.

Through these findings, the researchers hope to provide guidance for future research on improving the creativity of LLMs and to emphasize the importance of comprehensive evaluation of creativity.
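This summary does not say how the automated metrics are computed. Purely as an illustration of what an automated diversity measure can look like, the sketch below computes two common proxies over a set of stories: the share of unique n-grams (distinct-n) and the mean pairwise cosine distance between bag-of-words vectors. Both are stand-ins chosen here for simplicity, not the paper's metrics, and the function names (`distinct_n`, `mean_pairwise_distance`) are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinct_n(stories, n=2):
    """Share of unique n-grams across all stories (lexical diversity proxy).

    Assumption: this is a generic proxy, not the metric used in the paper.
    """
    ngrams = []
    for story in stories:
        tokens = tokenize(story)
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def cosine_distance(a, b):
    """1 - cosine similarity between bag-of-words count vectors of two texts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty text as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def mean_pairwise_distance(stories):
    """Average cosine distance over all story pairs (set-level diversity proxy)."""
    pairs = list(combinations(stories, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up example stories, for demonstration only.
    stories = [
        "The stamp fell from the letter and I sent it anyway.",
        "She mailed the letter without a stamp and hoped for the best.",
        "A dragon ate the post office, so nobody got mail that week.",
    ]
    print("distinct-2:", round(distinct_n(stories, n=2), 3))
    print("mean pairwise distance:", round(mean_pairwise_distance(stories), 3))
```

Higher values on either measure indicate that a group of stories shares less wording. A measure based on sentence embeddings would be closer to semantic diversity, but it needs an external model, so it is left out of this sketch.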

Evaluating Creative Short Story Generation in Humans and Large Language Models

The Language of Creativity: Evidence from Humans and Large Language Models

Small Language Models can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs

On the Creativity of Large Language Models

Characterising the Creative Process in Humans and Large Language Models

Are Large Language Models Capable of Generating Human-Level Narratives?

Assessing and Understanding Creativity in Large Language Models

The Next Chapter: A Study of Large Language Models in Storytelling

Divergent Creativity in Humans and Large Language Models

Evaluating Large Language Model Creativity from a Literary Perspective

Large Language Models show both individual and collective creativity comparable to humans

Art or Artifice? Large Language Models and the False Promise of Creativity

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation

CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints

Steering Large Language Models to Evaluate and Amplify Creativity

Human Creativity in the Age of LLMs: Randomized Experiments on Divergent and Convergent Thinking

Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?

Automatic Scoring of Metaphor Creativity with Large Language Models

Collaborative Storytelling with Large-scale Neural Language Models

Generative AI enhances individual creativity but reduces the collective diversity of novel content