Abstract: In response to the rapid growth of global video traffic and the limitations of traditional wireless transmission systems, we propose a novel dual-stage vector quantization framework, VQ-DeepVSC, tailored to enhance video transmission over wireless channels. In the first stage, we design the adaptive keyframe extractor and interpolator, deployed respectively at the transmitter and receiver, which intelligently select key frames to minimize inter-frame redundancy and mitigate the cliff-effect under challenging channel conditions. In the second stage, we propose the semantic vector quantization encoder and decoder, placed respectively at the transmitter and receiver, which efficiently compress key frames using advanced indexing and spatial normalization modules to reduce redundancy. Additionally, we propose adjustable index selection and recovery modules, enhancing compression efficiency and enabling flexible compression ratio adjustment. Compared to the joint source-channel coding (JSCC) framework, the proposed framework exhibits superior compatibility with current digital communication systems. Experimental results demonstrate that VQ-DeepVSC achieves substantially better Multi-Scale Structural Similarity (MS-SSIM) and Learned Perceptual Image Patch Similarity (LPIPS) scores than the H.265 standard, particularly under low channel signal-to-noise ratio (SNR) or multi-path channels, highlighting the significantly enhanced transmission capabilities of our approach.

What problem does this paper attempt to address?

The main problem this paper addresses is the tension between the rapid growth of global video traffic and the limitations of traditional wireless transmission systems. Although traditional wireless video transmission systems have improved at optimizing the bit error rate (BER), they still struggle to guarantee high-quality transmission. They focus mainly on compression efficiency and lack the semantic understanding and adaptability that dynamic network conditions require. Even with the latest H.265 technology, the cliff-effect remains unsolved. The cliff-effect is the sharp decline in video transmission quality that occurs when the channel signal-to-noise ratio (SNR) falls below a critical threshold.

To solve these problems, the authors propose VQ-DeepVSC, a new framework based on two-stage vector quantization, to enhance video transmission performance over wireless channels. The framework has two stages:

1. **First stage**: An adaptive keyframe extractor and an interpolator are deployed at the transmitter and the receiver, respectively. This stage reduces inter-frame redundancy by intelligently selecting key frames and mitigates the cliff-effect under poor channel conditions.
2. **Second stage**: A semantic vector quantization encoder and decoder are placed at the transmitter and the receiver, respectively. This stage efficiently compresses key frames through advanced indexing and spatial normalization modules, further reducing redundancy. The authors also introduce adjustable index selection and recovery modules, which improve compression efficiency and allow the compression ratio to be adjusted flexibly.

Experimental results show that, compared with the H.265 standard, VQ-DeepVSC significantly improves the multi-scale structural similarity (MS-SSIM) and learned perceptual image patch similarity (LPIPS) metrics, especially under low channel signal-to-noise ratio or multipath channel conditions, demonstrating stronger transmission capabilities.
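
To make the two stages more concrete, here is a minimal sketch of the underlying ideas in plain Python. It is not the authors' implementation: the paper's keyframe extractor and semantic encoder are learned modules, whereas this example substitutes a simple frame-difference threshold for keyframe selection and a nearest-codeword lookup for vector quantization. All names (`select_keyframes`, `vq_encode`, `vq_decode`, `threshold`) and the toy data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def select_keyframes(frames: List[Vector], threshold: float) -> List[int]:
    """Stage 1 stand-in (assumption): keep frame 0, then keep any frame whose
    mean absolute difference from the last kept frame exceeds `threshold`."""
    if not frames:
        return []
    kept = [0]
    for t in range(1, len(frames)):
        ref = frames[kept[-1]]
        mad = sum(abs(x - y) for x, y in zip(frames[t], ref)) / len(ref)
        if mad > threshold:
            kept.append(t)
    return kept


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(vectors: List[Vector], codebook: List[Vector]) -> List[int]:
    """Stage 2 stand-in: map each feature vector to the index of its nearest
    codeword. Only these integer indices need to be transmitted."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        for v in vectors
    ]


def vq_decode(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Receiver side: recover approximate vectors by codebook lookup."""
    return [codebook[i] for i in indices]


# Toy data (hypothetical): five flattened 4-pixel "frames".
frames = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],   # nearly identical to frame 0, so it is skipped
    [5, 5, 5, 5],   # scene change, so it is kept
    [5, 5, 5, 6],   # nearly identical to frame 2, so it is skipped
    [0, 0, 0, 0],   # differs from frame 2, so it is kept
]
keyframe_ids = select_keyframes(frames, threshold=1.0)

# Toy codebook shared by transmitter and receiver (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
features = [[0.1, -0.2], [0.9, 1.2], [3.5, 4.1]]
indices = vq_encode(features, codebook)
reconstructed = vq_decode(indices, codebook)

print(keyframe_ids)   # [0, 2, 4]
print(indices)        # [0, 1, 2]
print(reconstructed)  # [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
```

Because only integer indices into a shared codebook are sent, this kind of scheme maps naturally onto existing digital communication systems, which is the compatibility advantage over JSCC that the abstract highlights.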

VQ-DeepVSC: A Dual-Stage Vector Quantization Framework for Video Semantic Communication

Vector Quantized Semantic Communication System

A Predictive VQ Based Video Compression Scheme

Wireless Deep Video Semantic Transmission

A Multi-Scale Spatial-Temporal Network for Wireless Video Transmission

MDVSC -- Wireless Model Division Video Semantic Communication

A compressed domain multipurpose video watermarking algorithm using vector quantization

Video Transmission over Cognitive Radio Networks

VideoQA-SC: Adaptive Semantic Communication for Video Question Answering

VQ Image Coding Using Sub-Vector Techniques

Beyond VVC: Towards Perceptual Quality Optimized Video Compression Using Multi-Scale Hybrid Approaches

Deep Learning Enabled Semantic Communication Systems for Video Transmission

Activation Map-based Vector Quantization for 360-degree Image Semantic Communication

Joint Task and Data Oriented Semantic Communications: A Deep Separate Source-channel Coding Scheme

Wireless Scalable Video Coding Using a Hybrid Digital-Analog Scheme

DeepWiVe: Deep-Learning-Aided Wireless Video Transmission

Adaptive Wireless Image Semantic Transmission: Design, Simulation, and Prototype Validation

D$^2$-JSCC: Digital Deep Joint Source-channel Coding for Semantic Communications

Multi-stage vector quantization towards low bit rate visual search

MOC-RVQ: Multilevel Codebook-Assisted Digital Generative Semantic Communication