Camera Pose-Based Background Modeling for Video Coding in Moving Cameras

Zheng Fang, Mingkui Zheng, Pingping Chen, Zhifeng Chen, Dapeng Oliver Wu
DOI: https://doi.org/10.1109/tcsvt.2023.3318257
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: For moving cameras, the video content changes significantly, which leads to inaccurate prediction in traditional inter prediction and results in limited compression efficiency. To solve these problems, first, we propose a camera pose-based background modeling (CP-BM) framework that uses the camera motion and the textures of reconstructed frames to model the background of the current frame. Compared with the reconstructed frames, the predicted background frame generated by CP-BM is more geometrically similar to the current frame in position and is more strongly correlated with it at the pixel level; thus, it can serve as a higher-quality reference for inter prediction, and the compression efficiency can be improved. Second, to compensate for the motion of the background pixels, we construct a pixel-level motion vector field that can accurately describe various complex motions with only a small overhead. Our method is more general than other motion models because it has more degrees of freedom, and when the degrees of freedom are decreased, it encompasses other motion models as special cases. Third, we propose an optical flow-based depth estimation (OF-DE) method to synchronize the depth information at the codec, which is used to build the motion vector field. Finally, we integrate the overall scheme into the High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC) reference software HM-16.7 and VTM-10.0. Experimental results demonstrate that in HM-16.7, for in-vehicle video sequences, our solution achieves an average Bjøntegaard delta bit rate (BD-rate) gain of 8.02% and reduces the encoding time by 20.9% due to the superiority of our scheme in motion estimation. Moreover, in VTM-10.0 with affine motion compensation (MC) turned off and turned on, our method achieves average BD-rate gains of 5.68% and 0.56%, respectively.
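The abstract does not give the construction of the pixel-level motion vector field, so the following is only a minimal sketch of the general idea, assuming a standard pinhole camera and a static background. It is not the authors' CP-BM implementation. The function name `motion_vector_field`, the intrinsics tuple `(fx, fy, cx, cy)`, and the pose convention (R, t map current-camera coordinates into the reference camera) are all hypothetical choices made for illustration.

```python
def motion_vector_field(depth, intrinsics, R, t):
    """Per-pixel motion vectors induced by camera motion over a static scene.

    Assumptions (not taken from the paper):
      depth[v][u]  depth of pixel (u, v) in the current frame, > 0
      intrinsics   (fx, fy, cx, cy) of a pinhole camera
      R, t         3x3 rotation (list of rows) and 3-vector translation that
                   map current-camera coordinates to reference-camera coordinates
    Returns field[v][u] = (du, dv), or None where the point falls behind
    the reference camera.
    """
    fx, fy, cx, cy = intrinsics
    field = []
    for v, row in enumerate(depth):
        out_row = []
        for u, z in enumerate(row):
            # Back-project the pixel to a 3D point in the current camera frame.
            p = ((u - cx) / fx * z, (v - cy) / fy * z, z)
            # Apply the rigid camera motion to reach the reference camera frame.
            q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
            if q[2] <= 0:
                out_row.append(None)
                continue
            # Project into the reference image and take the displacement.
            u_ref = fx * q[0] / q[2] + cx
            v_ref = fy * q[1] / q[2] + cy
            out_row.append((u_ref - u, v_ref - v))
        field.append(out_row)
    return field


# Usage: constant depth and a pure sideways translation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
flat_depth = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
mv = motion_vector_field(flat_depth, (100.0, 100.0, 1.0, 1.0), identity, [0.1, 0.0, 0.0])
```

In this toy case every pixel receives the same vector, so the field reduces to a single translational motion. That is at least consistent with the abstract's remark that simpler motion models appear as special cases when degrees of freedom are removed; with varying depth or a non-identity rotation, each pixel gets its own vector.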