End-To-End Compression for Surveillance Video with Unsupervised Foreground-Background Separation

Yu Zhao, Dengyan Luo, Fuchun Wang, Han Gao, Mao Ye, Ce Zhu
DOI: https://doi.org/10.1109/tbc.2023.3280039
IF: 4.5
2023-01-01
IEEE Transactions on Broadcasting
Abstract: With the exponential growth of surveillance video, efficient video coding methods are in great demand. Learning-based methods have emerged that either directly apply a general video compression framework or separate the foreground and background and then compress them in two stages. However, the former do not exploit the relatively static background of surveillance video, while the latter separate foreground and background offline, which degrades separation performance because temporal correlation is not fully considered. In this paper, we propose an end-to-end Unsupervised foreground-background separation based Video Compression neural Network, dubbed UVCNet. Our method consists of three main parts. First, the Mask Net separates foreground and background online in an unsupervised manner, making full use of the temporal correlation prior. Then, a traditional motion estimation-based residual coding module is applied to foreground compression. Simultaneously, a background compression module compresses the background residual and updates the background, fully exploiting its relatively static property. Unlike previous approaches, our method does not separate foreground and background in advance but does so in an end-to-end manner. As a result, it can not only use the relatively static background property to save bit rate but also achieve end-to-end online video compression. Experimental results demonstrate that the proposed UVCNet achieves superior performance compared with state-of-the-art methods. Specifically, UVCNet achieves a 2.11 dB average improvement in Peak Signal-to-Noise Ratio (PSNR) over H.265 on surveillance datasets.
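For readers unfamiliar with the metric behind the reported gain, the following is a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE). It is not taken from the paper: the function names `psnr` and `mse_ratio_for_gain`, the flat-list input format, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Assumes 8-bit pixels by default (max_value=255); this is an
    illustrative choice, not a detail stated in the abstract.
    """
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must be non-empty")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Return the factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    original = [10, 20, 30, 40]
    decoded = [11, 21, 31, 41]  # every pixel off by one, so MSE = 1
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
    print(f"MSE ratio for a 2.11 dB gain: {mse_ratio_for_gain(2.11):.2f}")
```

Under this definition, a 2.11 dB PSNR difference on a single frame at a fixed peak value corresponds to a mean squared error that is lower by a factor of roughly 1.6. The abstract reports an average improvement, so this conversion is only a rough intuition, not a statement about the paper's evaluation protocol.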