Horizontal-to-Vertical Video Conversion

Tun Zhu, Daoxin Zhang, Yao Hu, Tianran Wang, Xiaolong Jiang, Jianke Zhu, Jiawei Li
DOI: https://doi.org/10.1109/tmm.2021.3092202
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract: In this blooming age of social media and mobile platforms, mass consumers are migrating from horizontal video to vertical content delivered on hand-held devices. Accordingly, revitalizing the exposure of horizontal video becomes vital and urgent, a problem we tackle with our automated horizontal-to-vertical (abbreviated as H2V) video conversion framework. Essentially, the H2V framework performs subject-preserving video cropping, instantiated in the proposed Rank-SS module. Rank-SS incorporates object detection to discover candidate subjects, from which we select the primary subject to preserve by leveraging location, appearance, and salient cues in a convolutional neural network. In addition to converting horizontal videos to vertical ones by cropping around the selected subject, automatic shot detection and multi-object tracking are integrated into the H2V framework to accommodate long and complex videos. To develop H2V systems, we collect the H2V-142K dataset, containing 125 videos (132K frames) and 9500 cover images annotated with primary subject bounding boxes. On H2V-142K and public object detection datasets, our method demonstrates promising results on subject selection compared to related solutions. Furthermore, our H2V framework is industrially deployed, hosting millions of daily active users, and exhibits favorable H2V conversion performance. By making this dataset as well as our approach publicly available, we wish to pave the way for more horizontal-to-vertical video conversion research. Our collected H2V-142K dataset is available at https://tianchi.aliyun.com/dataset/dataDetail?dataId=93339.
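The cropping step described in the abstract can be illustrated with a minimal sketch. It is not the paper's Rank-SS module: it assumes the primary subject's bounding box has already been selected and only computes a full-height vertical crop window centered on that box. The function name `vertical_crop_window` and the 9:16 target ratio are assumptions made for illustration.

```python
def vertical_crop_window(frame_w, frame_h, subject_box, aspect_w=9, aspect_h=16):
    """Return (left, top, right, bottom) of a full-height vertical crop.

    subject_box is (x1, y1, x2, y2) in pixels. The crop keeps the full frame
    height and is centered horizontally on the subject, then shifted as
    needed so that it stays inside the frame.
    """
    x1, _, x2, _ = subject_box
    # Width of a crop with the target aspect ratio at full frame height.
    crop_w = min(frame_w, frame_h * aspect_w // aspect_h)
    center_x = (x1 + x2) / 2
    left = int(center_x - crop_w / 2)
    # Clamp so the window does not extend past either frame edge.
    left = max(0, min(left, frame_w - crop_w))
    return left, 0, left + crop_w, frame_h


# Hypothetical 1920x1080 frame with a subject near the middle.
window = vertical_crop_window(1920, 1080, (900, 200, 1100, 800))
```

In a full pipeline the window would typically be smoothed across frames within a shot to avoid jitter; that part is omitted here.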
computer science, information systems, telecommunications, software engineering