Are Large Language Models Capable of Generating Human-Level Narratives?

Yufei Tian, Tenghao Huang, Miri Liu, Derek Jiang, Alexander Spangher, Muhao Chen, Jonathan May, Nanyun Peng

2024-10-05

Abstract: This paper investigates the capability of LLMs in storytelling, focusing on narrative development and plot progression. We introduce a novel computational framework to analyze narratives through three discourse-level aspects: i) story arcs, ii) turning points, and iii) affective dimensions, including arousal and valence. By leveraging expert and automatic annotations, we uncover significant discrepancies between the LLM- and human-written stories. While human-written stories are suspenseful, arousing, and diverse in narrative structures, LLM stories are homogeneously positive and lack tension. Next, we measure narrative reasoning skills as a precursor to generative capacities, concluding that most LLMs fall short of human abilities in discourse understanding. Finally, we show that explicit integration of aforementioned discourse features can enhance storytelling, as is demonstrated by over 40% improvement in neural storytelling in terms of diversity, suspense, and arousal.

Computation and Language

What problem does this paper attempt to address?

This paper examines how well large language models (LLMs) tell stories, with a focus on narrative development and plot progression. The researchers introduce a computational framework that analyzes narratives through three discourse-level aspects: **story arcs**, **turning points**, and **affective dimensions** (arousal and valence). Using expert and automatic annotations, the study finds significant differences between LLM-generated and human-written stories. Specifically:

1. **Human-written stories** are suspenseful, arousing, and diverse in narrative structure.
2. **LLM-generated stories** are homogeneously positive and lack tension.

The study also measures narrative reasoning skills as a precursor to generative capacity and concludes that most LLMs fall short of human abilities in discourse understanding. Finally, it shows that explicitly integrating these discourse features can enhance storytelling, with improvements of over 40% in diversity, suspense, and arousal.

In summary, the question the paper asks is: **Can large language models generate human-level narratives?** Its analyses and experiments identify where LLMs fall short in narrative generation and point to possible directions for improvement.

The Next Chapter: A Study of Large Language Models in Storytelling

Evaluating Creative Short Story Generation in Humans and Large Language Models

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation

Assessing Language Models' Worldview for Fiction Generation

A Structured Narrative Prompt for Prompting Narratives from Large Language Models: Sentiment Assessment of ChatGPT-Generated Narratives and Real Tweets

Using large language models to create narrative events

Creating Suspenseful Stories: Iterative Planning with Large Language Models

What is Wrong with Language Models that Can Not Tell a Story?

Collaborative Storytelling with Large-scale Neural Language Models

Analyzing Nobel Prize Literature with Large Language Models

Improving Visual Storytelling with Multimodal Large Language Models

Explingo: Explaining AI Predictions using Large Language Models

Can Large Language Models Be an Alternative to Human Evaluations?

StoryAnalogy: Deriving Story-level Analogies from Large Language Models to Unlock Analogical Understanding

Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding

Can Nuanced Language Lead to More Actionable Insights? Exploring the Role of Generative AI in Analytical Narrative Structure

StoryGPT-V: Large Language Models as Consistent Story Visualizers

Large Language Models as Narrative-Driven Recommenders

HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

Tell Me a Story! Narrative-Driven XAI with Large Language Models