Screenplay Summarization Using Latent Narrative Structure

Pinelopi Papalampidi, Frank Keller, Lea Frermann, Mirella Lapata

DOI: https://doi.org/10.48550/arXiv.2004.12727

2020-04-27

Abstract: Most general-purpose extractive summarization models are trained on news articles, which are short and present all important information upfront. As a result, such models are biased on position and often perform a smart selection of sentences from the beginning of the document. When summarizing long narratives, which have complex structure and present information piecemeal, simple position heuristics are not sufficient. In this paper, we propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models. We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays (i.e., extract an optimal sequence of scenes). Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode and improve summarization performance over general extractive algorithms leading to more complete and diverse summaries.

Computation and Language

What problem does this paper attempt to address?

The paper addresses a limitation of existing extractive summarization models: they are trained primarily on news articles, which are short and present all important information at the beginning. As a result, these models tend to select sentences from the start of the document. This position-based heuristic is poorly suited to long narrative texts such as screenplays, which have complex structure and present information gradually. To address this, the authors explicitly incorporate narrative structure into general unsupervised and supervised extractive summarization models, aiming for more complete and diverse screenplay summaries.

Specifically, the authors address the problem in three ways:

1. **Introducing Narrative Structure**: Formalizing narrative structure in terms of key narrative events (turning points) and treating it as latent.
2. **Improving Summarization Algorithms**: Enhancing existing extractive summarization algorithms with narrative structure so that they select better scenes from long narrative texts.
3. **Experimental Validation**: Conducting experiments on the CSI corpus of TV screenplays, augmented with scene-level summarization labels, to verify that incorporating narrative structure improves summarization performance.

With these steps, the authors aim for summaries that better capture the key events of the story and therefore reflect the content of a screenplay more completely and with greater diversity.
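The contrast between a position heuristic and structure-aware scene selection can be illustrated with a minimal sketch. This is not the authors' model: in the paper the turning points are latent and learned, whereas here the per-scene turning-point probabilities (`tp_probs`), the salience scores, the weight `tp_weight`, and both function names are hypothetical inputs invented for illustration.

```python
from typing import List, Sequence


def lead_baseline(num_scenes: int, k: int) -> List[int]:
    """Position heuristic: always pick the first k scenes."""
    return list(range(min(k, num_scenes)))


def turning_point_aware_selection(
    salience: Sequence[float],
    tp_probs: Sequence[Sequence[float]],
    k: int,
    tp_weight: float = 1.0,
) -> List[int]:
    """Pick k scenes by content salience plus turning-point evidence.

    salience[i]    -- hypothetical content score of scene i
    tp_probs[t][i] -- hypothetical probability that scene i realizes
                      turning point t (assumed given; latent in the paper)
    """
    n = len(salience)
    scores = []
    for i in range(n):
        # Strongest evidence that scene i is any turning point.
        tp_score = max((probs[i] for probs in tp_probs), default=0.0)
        scores.append(salience[i] + tp_weight * tp_score)
    # Rank scenes by combined score, keep the top k, restore story order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy episode with six scenes and two turning points (made-up numbers).
salience = [0.9, 0.6, 0.3, 0.2, 0.4, 0.1]
tp_probs = [
    [0.0, 0.1, 0.7, 0.1, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1, 0.2, 0.7],
]

print(lead_baseline(len(salience), 3))                      # [0, 1, 2]
print(turning_point_aware_selection(salience, tp_probs, 3))  # [0, 2, 5]
```

In this toy example the lead baseline never leaves the opening scenes, while the structure-aware variant also selects scenes 2 and 5, where the assumed turning-point probabilities peak. The additive scoring is only one simple way to combine the two signals.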

SummScreen: A Dataset for Abstractive Screenplay Summarization

ScreenWriter: Automatic Screenplay Generation and Movie Summarisation

Movie Summarization via Sparse Graph Construction

MovieSum: An Abstractive Summarization Dataset for Movie Screenplays

Select and Summarize: Scene Saliency for Movie Script Summarization

"Previously on ..." From Recaps to Story Summarization

Video Summarization with Long Short-term Memory

DiscoGraMS: Enhancing Movie Screen-Play Summarization using Movie Character-Aware Discourse Graph

Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction.

Towards Automatic Textual Summarization of Movies

NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

SNaC: Coherence Error Detection for Narrative Summarization

Abstractive Summarization Guided by Latent Hierarchical Document Structure

Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

LLM Based Multi-Document Summarization Exploiting Main-Event Biased Monotone Submodular Content Extraction

Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization

Movie Plot Analysis via Turning Point Identification

Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

Computing and Exploiting Document Structure to Improve Unsupervised Extractive Summarization of Legal Case Decisions

Hierarchical Encoders for Modeling and Interpreting Screenplays