Convolutional Neural Network-Based Fractional-Pixel Motion Compensation
Ning Yan, Dong Liu, Houqiang Li, Bin Li, Li Li, Feng Wu
DOI: https://doi.org/10.1109/tcsvt.2018.2816932
IF: 5.859
2019-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Fractional-pixel motion compensation (MC) improves the efficiency of inter prediction and has been used extensively in video coding standards. Traditional methods of fractional-pixel MC usually follow the interpolation approach, i.e., they adopt various filters, either fixed or adaptive, to interpolate fractional-pixel values from the integer-pixel values of a reference picture. In contrast to the interpolation approach, in this paper we formulate fractional-pixel MC as an inter-picture regression problem: predicting the pixel values of the current to-be-coded picture from the integer-pixel values of a reference picture, given a fractional-pixel motion vector that relates the two pictures. Inspired by recent advances in convolutional neural networks (CNNs), we propose to adopt CNN models to solve this regression problem. Accordingly, we propose the fractional-pixel reference generation CNN (FRCNN) for both uni-directional and bi-directional MC in video coding. We further investigate how to train FRCNN using encoded video sequences, and empirically study the effect of different training data and different CNN structures. Moreover, we propose to integrate FRCNN into the High Efficiency Video Coding (HEVC) scheme, and perform a comprehensive set of experiments to evaluate the effectiveness of FRCNN. The experimental results show that the proposed FRCNN achieves average bit savings of 3.9%, 2.7%, and 1.3% compared with HEVC under the low-delay P, low-delay B, and random-access configurations, respectively.
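For readers unfamiliar with the interpolation approach that the abstract contrasts against, the following is a minimal sketch of fixed-filter half-pixel interpolation along a single row of samples. The 8-tap coefficients are those of the HEVC luma half-sample filter; the function name, the border clamping, and the sample rows are illustrative assumptions and are not taken from the paper. A real codec applies such filters separably in two dimensions with higher intermediate precision, and the paper's FRCNN replaces this kind of filtering with a learned regression that is not shown here.

```python
# Sketch only: 1-D half-pixel interpolation with a fixed 8-tap filter.
# The taps are the HEVC luma half-sample coefficients (they sum to 64).
# Function name, border handling, and test rows are hypothetical.

HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def interpolate_half_pel(row, bit_depth=8):
    """Return the half-pixel samples lying between row[i] and row[i + 1]."""
    max_val = (1 << bit_depth) - 1
    n = len(row)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HALF_PEL_TAPS):
            # Taps cover integer positions i-3 .. i+4; clamp at the borders.
            j = min(max(i - 3 + k, 0), n - 1)
            acc += tap * row[j]
        # Add the rounding offset, normalise by 64, clip to the sample range.
        value = (acc + 32) >> 6
        out.append(min(max(value, 0), max_val))
    return out


# A constant row stays constant; a linear ramp yields midpoints away
# from the clamped borders.
flat = interpolate_half_pel([100] * 8)
ramp = interpolate_half_pel([0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
print(flat)
print(ramp[3:6])
```

The filter is the same for every block and every picture, which is the property the regression formulation relaxes: instead of fixed taps, a model is trained to map integer-pixel reference samples directly to the pixels of the picture being coded.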