HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

Jing Chen, Xinyu Zhu, Cheng Yang, Chufan Shi, Yadong Xi, Yuxiang Zhang, Junjie Wang, Jiashu Pu, Rongsheng Zhang, Yujiu Yang, Tian Feng
2024-06-18
Abstract: Generative AI has demonstrated unprecedented creativity in the field of computer vision, yet such phenomena have not been observed in natural language processing. In particular, large language models (LLMs) can hardly produce written works at the level of human experts due to the extremely high complexity of literature writing. In this paper, we present HoLLMwood, an automated framework for unleashing the creativity of LLMs and exploring their potential in screenwriting, which is a highly demanding task. Mimicking the human creative process, we assign LLMs to different roles involved in the real-world scenario. In addition to the common practice of treating LLMs as Writer, we also apply LLMs as Editor, who is responsible for providing feedback and revision advice to Writer. Besides, to enrich the characters and deepen the plots, we introduce a role-playing mechanism and adopt LLMs as Actors that can communicate and interact with each other. Evaluations on automatically generated screenplays show that HoLLMwood substantially outperforms strong baselines in terms of coherence, relevance, interestingness and overall quality.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses the shortcomings of large language models (LLMs) in literary creation, particularly in screenwriting. Generative AI has demonstrated unprecedented creativity in computer vision, but comparable results have not yet been observed in natural language processing, and in literary writing in particular. Current LLMs struggle to produce literary works comparable to those of human experts because literary writing is highly complex.

The paper proposes an automated framework named **HoLLMwood** that unleashes the creativity of LLMs through a role-playing mechanism and explores their potential in the demanding task of screenwriting. The framework simulates the human creative process by assigning LLMs to three roles:

- **Writer**: writes the story.
- **Editor**: provides feedback and revision suggestions to the Writer.
- **Actors**: enrich character dialogues and interactions through role-playing.

### Main Contributions

1. **Experimental results reveal the difficulty of LLMs in generating high-quality literary works under simple guidance**:
   - LLMs perform poorly at generating scripts with vivid characters and engaging plots.
   - Dialogues and interactions often appear mechanical and dull, which shows how hard it is to apply LLMs directly to creative tasks.
2. **Proposes a fully automated screenwriting framework, HoLLMwood**:
   - The framework enables non-professionals to create engaging scripts and also gives industry professionals an auxiliary tool.
   - Users only need to provide an initial storyline; the framework handles the remaining complex tasks automatically, democratizing a field that traditionally requires extensive experience and specific skills.
3. **Experimental evaluation shows the superiority of HoLLMwood in multiple dimensions**:
   - In pairwise comparisons judged by GPT-4, scripts generated by HoLLMwood significantly outperform other methods in coherence, relevance, and interestingness.
   - Ablation experiments further show that both the feedback-revision mechanism and the role-playing mechanism contribute positively to final script quality.

### Experimental Setup and Results

- **Dataset**: Initial storylines synthesized by LLMs serve as input. They cover 6 movie genres (romance, sci-fi, horror, drama, crime, and comedy) with 10 examples per genre, for 60 instances in total.
- **Baseline Methods**: Plan-then-Write, which generates the script for each episode sequentially from designed roles and outlines, and DOC-screen, which first generates story chapters with DOC and then converts them into scripts.
- **Evaluation**: Pairwise comparison using GPT-4 along four dimensions: coherence, relevance, interestingness, and overall quality. HoLLMwood significantly outperforms the baseline methods in all dimensions, especially in interestingness and overall quality.

### Conclusion

HoLLMwood significantly improves the performance of LLMs on screenwriting through its role-playing and feedback-revision mechanisms, enabling them to generate high-quality scripts. The framework helps non-professionals create engaging works and also provides a powerful auxiliary tool for industry professionals. As the capabilities of foundation models improve, the framework is expected to generate scripts approaching human-level quality.
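
To make the Writer, Editor, and Actors roles more concrete, the following is a minimal Python sketch of a feedback-revision loop followed by a role-playing pass. It is not the authors' implementation: the `LLM` callable type, the function names `write_with_feedback` and `role_play`, the prompt wording, the `"OK"` stopping signal, the round limit, and the round-robin turn order are all assumptions made for illustration, since this summary does not specify them.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def write_with_feedback(storyline: str, writer: LLM, editor: LLM,
                        max_rounds: int = 2) -> str:
    """Writer drafts, Editor advises, Writer revises (assumed loop shape)."""
    draft = writer(f"Write a plot outline for this storyline:\n{storyline}")
    for _ in range(max_rounds):
        advice = editor(f"Give revision advice for this draft:\n{draft}")
        # Assumed convention: the Editor answers "OK" when no changes are needed.
        if advice.strip().upper() == "OK":
            break
        draft = writer(
            f"Revise the draft using the advice.\nDraft:\n{draft}\nAdvice:\n{advice}"
        )
    return draft


def role_play(scene: str, actors: Dict[str, LLM], turns: int = 4) -> List[str]:
    """Actors speak in round-robin order, each seeing the dialogue so far."""
    names = list(actors)
    lines: List[str] = []
    for turn in range(turns):
        name = names[turn % len(names)]
        history = "\n".join(lines)
        reply = actors[name](
            f"Scene: {scene}\nDialogue so far:\n{history}\n"
            f"You are {name}. Respond in character."
        )
        lines.append(f"{name}: {reply}")
    return lines
```

In use, `writer`, `editor`, and each entry of `actors` would wrap calls to a language model, and the dialogue returned by `role_play` would be passed back to the Writer to be turned into script form. Keeping the roles as plain callables makes it easy to swap in stubs for testing or to disable one mechanism, in the spirit of the ablations described above.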