Assessing Box Office Performance Using Movie Scripts: A Kernel-Based Approach

Jehoshua Eliashberg, Sam K. Hui, Z. John Zhang
DOI: https://doi.org/10.1109/TKDE.2014.2306681
2014-01-01
Abstract: We develop a methodology to predict the box office performance of a movie at the point of green-lighting, when only its script and estimated production budget are available. We extract three levels of textual features (genre and content, semantics, and bag-of-words) from scripts using screenwriting domain knowledge, human input, and natural language processing techniques. These textual variables define a distance metric across scripts, which is then used as an input for a kernel-based approach to assess box office performance. We show that our proposed methodology predicts box office revenues more accurately than benchmark methods, with a 29 percent lower mean squared error (MSE).
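To illustrate how a distance metric across scripts can feed a kernel-based predictor, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the kernel, the distance, or the estimator, so the Gaussian kernel, the Euclidean distance, the Nadaraya-Watson weighted average, the `bandwidth` parameter, and the function names are all assumptions made for illustration. The feature vectors and revenue figures are hypothetical toy values.

```python
import math
from typing import Sequence


def script_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two script feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kernel_predict(
    query: Sequence[float],
    train_features: Sequence[Sequence[float]],
    train_revenues: Sequence[float],
    bandwidth: float = 1.0,
) -> float:
    """Nadaraya-Watson estimate: a kernel-weighted average of known revenues.

    Scripts closer to the query (smaller distance) receive larger weights.
    """
    # Gaussian kernel applied to the distance between the query and each script.
    weights = [
        math.exp(-(script_distance(query, f) ** 2) / (2.0 * bandwidth ** 2))
        for f in train_features
    ]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed to zero; fall back to the unweighted mean.
        return sum(train_revenues) / len(train_revenues)
    return sum(w * r for w, r in zip(weights, train_revenues)) / total


if __name__ == "__main__":
    # Hypothetical toy data: two produced scripts with known revenues.
    features = [[0.0, 0.0], [10.0, 10.0]]
    revenues = [5.0, 50.0]
    # Predict revenue for a new script described by the same features.
    print(kernel_predict([1.0, 2.0], features, revenues, bandwidth=3.0))
```

In this sketch the bandwidth controls how quickly a script's influence decays with distance: a small value makes the prediction track the nearest scripts, while a large value pulls it toward the overall mean revenue.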