Object-oriented Video Prediction with Pixel-Level Attention.

Siyuan Wu, Hanli Wang, Qinyu Li
DOI: https://doi.org/10.1145/3240876.3240886
2018-01-01
Abstract: Predicting future video content is challenging, and it can be partly achieved by learning automatically from large numbers of videos. Among known video prediction approaches, pixel-level prediction is much harder than label-level prediction because of its dense nature. Unlike other work that treats pixels as the units of prediction, this work proposes a novel method that learns the movement of visual objects from previous frames via pixel-level attention and thus predicts future video frames. The proposed method improves video prediction performance by making object-oriented rather than pixel-oriented predictions over a long-term horizon of up to around one second. Comparative experiments on two large real-world video prediction datasets demonstrate the effectiveness of the proposed method.