Temporal Enhanced Hybrid Neural Representation for Video Compression

Jinxiang Wang, Yangdong Liu, Shiping Zhu, Cheng Feng
DOI: https://doi.org/10.1109/pcs60826.2024.10566352
2024-01-01
Abstract: Implicit neural representation methods model each video with a dedicated network, and they can be broadly categorized into two groups: index-based methods and hybrid methods. Index-based NeRVs generate embeddings solely from frame indices and therefore lack specific information about the video content. Conversely, hybrid NeRVs generate only video content embeddings, disregarding the positive impact of temporal cues during the fitting process. To address these limitations, we propose Temporal Enhanced Hybrid Neural Representation for Videos (TNeRV), which incorporates temporal modulation and diversity exploration to enhance the fitting process of the decoder. First, we introduce the Temporal Diversity Exploration (TDE) block, which generates video-diversity embeddings in addition to the video-specific embeddings, enabling the decoder to accurately perceive and adapt to temporal changes within the video. Next, we design the Temporal Modulation Fusion (TMF) block, which combines the two types of embeddings and integrates temporal cues to improve the fitting performance of the decoder. Finally, we evaluate TNeRV against state-of-the-art methods on video regression and video compression tasks, demonstrating that TNeRV outperforms existing implicit methods.
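To make the distinction between the two embedding families concrete, the following is a minimal sketch and not the TDE or TMF blocks themselves, whose internals the abstract does not specify. It assumes a sinusoidal encoding of the normalized frame index for the index-based case, a simple average pool standing in for a learned content encoder, and a scale-and-shift (FiLM-style) modulation as one plausible way temporal cues could be fused with content embeddings. All function names, the frequency base, and the dimensions are hypothetical.

```python
import math

def index_embedding(t, num_freqs=4, base=1.25):
    """Index-based embedding: depends only on the normalized frame index t in [0, 1].

    num_freqs and base are illustrative values, not taken from the paper.
    """
    emb = []
    for l in range(num_freqs):
        emb.append(math.sin((base ** l) * math.pi * t))
        emb.append(math.cos((base ** l) * math.pi * t))
    return emb

def content_embedding(frame, dim=4):
    """Stand-in for a learned encoder: average-pool a flat pixel list into dim values."""
    chunk = len(frame) // dim
    return [sum(frame[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]

def modulate(content, temporal):
    """Assumed FiLM-style fusion: temporal features give a per-channel scale and shift."""
    assert len(temporal) == 2 * len(content)
    half = len(temporal) // 2
    scale, shift = temporal[:half], temporal[half:]
    return [c * (1.0 + s) + b for c, s, b in zip(content, scale, shift)]

# A toy 16-pixel "frame" at the midpoint of the video.
frame = [i / 15 for i in range(16)]
fused = modulate(content_embedding(frame), index_embedding(0.5))
```

In this sketch the index embedding carries no information about pixel values and the content embedding carries no information about time, which mirrors the limitation the abstract attributes to each family; the fused vector depends on both.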