Dynamic Multi-Reference Generative Prediction for Face Video Compression.

Zhao Wang, Bolin Chen, Yan Ye, Shiqi Wang
DOI: https://doi.org/10.1109/icip46576.2022.9897729
2022-01-01
Abstract: Face videos contain abundant structured information and prior knowledge that generative neural networks can exploit to achieve ultra-low bitrate compression. However, generative neural network based face video compression suffers under large head motion, which can easily produce deformed images. This paper proposes a dynamic multi-reference prediction method for generative face video compression. Specifically, a key map is extracted as a compact latent representation of each face image. The key maps of the current frame and of multiple reference frames are used together to estimate multiple dense motion maps, which are then applied to the corresponding reference frames to generate the final prediction of the current frame. Moreover, the reference frames can be dynamically refreshed during encoding, converting large head motion into relatively small motion. Experimental results show that the proposed method achieves superior compression performance compared with the state-of-the-art VVC standard as well as the latest generative face compression frameworks.
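The dynamic refresh and multi-reference fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the refresh criterion or the fusion rule, so the L1 distance between key maps, the fixed `threshold`, the exponential weighting, and all function names below are assumptions made for illustration. The learned key-map extractor and the dense-motion warping are not modelled; key maps and warped references are taken as given arrays.

```python
import numpy as np

def should_refresh(cur_key, ref_keys, threshold):
    """Return True when the current key map is far from every buffered reference.

    Assumption: 'large head motion' is approximated by the mean absolute
    difference between key maps exceeding a hand-picked threshold.
    """
    dists = [float(np.mean(np.abs(cur_key - k))) for k in ref_keys]
    return min(dists) > threshold

def fuse_predictions(warped_refs, cur_key, ref_keys):
    """Blend already-warped reference frames into one prediction.

    Assumption: references whose key maps are closer to the current key map
    receive larger weights (normalized exp(-distance)).
    """
    dists = np.array([np.mean(np.abs(cur_key - k)) for k in ref_keys])
    weights = np.exp(-dists)
    weights /= weights.sum()
    return np.tensordot(weights, np.stack(warped_refs), axes=1)

def select_reference_frames(key_maps, threshold, max_refs=2):
    """Return the indices of frames that would be sent as refreshed references."""
    ref_indices = [0]  # the first frame always serves as a reference
    for i in range(1, len(key_maps)):
        recent = [key_maps[j] for j in ref_indices[-max_refs:]]
        if should_refresh(key_maps[i], recent, threshold):
            ref_indices.append(i)
    return ref_indices
```

With a refresh rule of this kind, a frame that drifts far from all buffered references becomes a new reference itself, so subsequent frames are predicted across a smaller motion gap.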