Video Highlight Prediction Using Audience Chat Reactions
Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander C. Berg
DOI: https://doi.org/10.48550/arXiv.1707.08559
2017-07-26
Computation and Language
Abstract: Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures.