Beyond Keypoint Coding: Temporal Evolution Inference with Compact Feature Representation for Talking Face Video Compression

Bolin Chen, Zhao Wang, Binzhe Li, Rongqun Lin, Shiqi Wang, Yan Ye
DOI: https://doi.org/10.1109/dcc52660.2022.00009
2022-01-01
Abstract: We propose a talking face video compression framework that implicitly transforms the temporal evolution of faces into a compact feature representation. More specifically, this temporal evolution, which is complex, non-linear and difficult to extrapolate, is modelled in an end-to-end inference framework based on very compact features. This enables high-quality rendering of the face videos, which benefits from learning a dense motion map with the compact feature representation. The proposed framework can therefore accommodate ultra-low-bandwidth video communication while maintaining the quality of the reconstructed videos. Experimental results demonstrate that the proposed scheme outperforms both the state-of-the-art video coding standard, Versatile Video Coding (VVC), and the latest generative compression scheme, Face Video-to-Video Synthesis (Face_vid2vid), in terms of both objective and subjective quality assessment.
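The abstract does not say how the dense motion map is applied to a frame. As a rough illustration only, the sketch below assumes one common formulation, backward warping, in which each output pixel is copied from a displaced position in a reference frame. The function name `warp_with_dense_motion`, the `(dy, dx)` offset layout, and the nearest-neighbour sampling with border clamping are all assumptions made for this example, not details taken from the paper.

```python
def warp_with_dense_motion(reference, motion):
    """Backward-warp a 2D frame with a per-pixel offset field.

    reference: list of rows (H x W) of pixel values.
    motion:    H x W grid of (dy, dx) offsets (hypothetical layout).
    Output pixel (y, x) is read from reference[y + dy][x + dx],
    rounded to the nearest pixel and clamped to the frame borders.
    """
    h, w = len(reference), len(reference[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = motion[y][x]
            src_y = min(max(y + round(dy), 0), h - 1)
            src_x = min(max(x + round(dx), 0), w - 1)
            row.append(reference[src_y][src_x])
        out.append(row)
    return out


# Toy 4x4 "reference frame" and a motion map that samples one pixel to the right.
reference = [[0, 1, 2, 3],
             [4, 5, 6, 7],
             [8, 9, 10, 11],
             [12, 13, 14, 15]]
shift_right = [[(0, 1)] * 4 for _ in range(4)]
warped = warp_with_dense_motion(reference, shift_right)

# A zero motion map leaves the frame unchanged.
zero_motion = [[(0, 0)] * 4 for _ in range(4)]
unchanged = warp_with_dense_motion(reference, zero_motion)
```

In a learned codec the offsets would be predicted by a network from the transmitted compact features rather than written by hand, and sampling would typically be bilinear so the operation stays differentiable.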