Deep Inter Prediction Via Pixel-Wise Motion Oriented Reference Generation

Sifeng Xia, Wenhan Yang, Yueyu Hu, Jiaying Liu
DOI: https://doi.org/10.1109/icip.2019.8803148
2019-01-01
Abstract: Inter prediction is an important module in video coding for temporal redundancy removal, where reference blocks are searched from previously coded frames and used to predict the block to be coded. However, apart from regular block-wise translational motion, there is often inconsistent pixel-wise motion between blocks, such as rotation and deformation, which can substantially degrade prediction performance. In this paper, we propose a Multiscale Adaptive Separable Convolutional Neural Network (MASCNN) to generate reference frames that are closer to the target at the pixel level for inter prediction. A multiscale network is built to interpolate the target frame from coarse to fine. Reconstruction losses are enforced at each scale so that the network infers the main structure at small scales, which improves its interpolation accuracy. Furthermore, a sum of absolute transformed differences (SATD) loss function is proposed to regularize network training, which further improves coding performance. Compared with HEVC, our method achieves an average BD-rate saving of 5.7%, and up to 9.9%, for the luma component under the random access configuration.
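The abstract names an SATD loss but does not give its formulation. As a minimal sketch of the general SATD measure, assuming the Hadamard transform that video encoders commonly use for it, the residual between a predicted block and a target block is transformed and the absolute coefficients are summed. The function names `hadamard` and `satd`, the 4x4 default block size, and the absence of normalization are illustrative assumptions, not details taken from the paper; a training loss would additionally need to be written in a differentiable framework.

```python
import numpy as np


def hadamard(n):
    """Build an n x n Hadamard matrix by Sylvester's construction (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]], dtype=np.int64)
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h


def satd(pred, target, block=4):
    """Sum of absolute Hadamard-transformed differences over non-overlapping blocks.

    Assumption: unnormalized transform and a square block size that divides
    both image dimensions. The paper's exact settings are not stated here.
    """
    pred = np.asarray(pred, dtype=np.int64)
    target = np.asarray(target, dtype=np.int64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    rows, cols = pred.shape
    if rows % block or cols % block:
        raise ValueError("image dimensions must be multiples of the block size")

    h = hadamard(block)
    diff = pred - target
    total = 0
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            d = diff[r:r + block, c:c + block]
            # 2-D Hadamard transform of the residual block, then sum of magnitudes.
            total += int(np.abs(h @ d @ h.T).sum())
    return total
```

Unlike a plain sum of absolute differences, this measure evaluates the residual in the transform domain, which is closer to what the encoder actually has to code. For example, a single mismatched pixel in a 4x4 block spreads across all 16 Hadamard coefficients, so it costs as much as a uniform offset of one on every pixel.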