The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories

Xi Yu Huang, Krishnapriya Vishnubhotla, Frank Rudzicz
2024-06-25
Abstract: The improved generative capabilities of large language models have made them a powerful tool for creative writing and storytelling. It is therefore important to quantitatively understand the nature of generated stories, and how they differ from human storytelling. We augment the Reddit WritingPrompts dataset with short stories generated by GPT-3.5, given the same prompts. We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. We find that generated stories differ significantly from human stories along all six dimensions, and that human and machine generations display similar biases when grouped according to the narrative point-of-view and gender of the main protagonist. We release our dataset and code at https://github.com/KristinHuangg/gpt-writing-prompts.
Computation and Language