Improved Nonlinear Transform Source-Channel Coding to Catalyze Semantic Communications

Sixian Wang, Jincheng Dai, Xiaoqi Qin, Zhongwei Si, Kai Niu, Ping Zhang
2023-08-18
Abstract: Recent deep learning methods have led to increased interest in solving high-efficiency end-to-end transmission problems. These methods, which we call nonlinear transform source-channel coding (NTSCC), extract the semantic latent features of the source signal and learn an entropy model that guides variable-rate joint source-channel coding for transmitting the latent features over wireless channels. In this paper, we propose a comprehensive framework for improving NTSCC that achieves higher system coding gain, better model versatility, and a more flexible adaptation strategy aligned with semantic guidance. The improved NTSCC model is ready to support large-size data interaction in emerging XR applications, which catalyzes the application of semantic communications. Specifically, we propose three improvements. First, we introduce a contextual entropy model to better capture the spatial correlations among the semantic latent features, which enables more accurate rate allocation and contextual joint source-channel coding, and hence higher coding gain. On that basis, we further propose response network architectures to formulate a versatile NTSCC, i.e., a once-trained model that supports various rates and channel states, which benefits practical deployment. Following this, we propose an online latent feature editing method to enable more flexible coding rate control aligned with specific semantic guidance. Applying these three improvements together yields a deployment-friendly semantic coded transmission system. Our improved NTSCC system has been experimentally verified to achieve considerable bandwidth savings versus the state-of-the-art engineered VTM + 5G LDPC coded transmission system, with lower processing latency.
Signal Processing, Multimedia
What problem does this paper attempt to address?
The paper addresses the difficulties that current wireless transmission systems face with large-scale data transmission and emerging applications, and in particular three obstacles to deploying Nonlinear Transform Source-Channel Coding (NTSCC) in practice:

1. **Correlation Modeling**: Existing NTSCC models handle spatial redundancy in images and videos poorly, so they cannot fully exploit the correlation between semantic features when allocating bandwidth. The paper introduces a contextual entropy model that captures these correlations and yields more accurate rate allocation.
2. **Model Compatibility**: The current NTSCC framework requires a separate model to be trained for each bandwidth ratio and channel condition, which is inconvenient for practical deployment. The paper proposes a response network architecture that lets a single trained model cover a wide range of rate-distortion operating points and wireless channel conditions.
3. **Model Adaptability**: Existing NTSCC models are optimized for the average rate-distortion cost over the training set, so their performance on a specific instance may be suboptimal. The paper proposes an online editing method for the semantic latent features that lets the model adapt, at inference time, to the data characteristics and channel conditions of each instance, thereby improving transmission efficiency.

With these three improvements, the paper aims to build a semantic coded transmission system that is easy to deploy and suited to emerging applications such as Extended Reality (XR). Experimental results show that the improved NTSCC system achieves considerable bandwidth savings and lower processing latency than the engineered VTM + 5G LDPC coded transmission system.
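
To make the idea of entropy-guided rate allocation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian entropy model over rounded latents, a scaling factor `eta` from bits to channel symbols, and a small set of allowed symbol counts; the function names, the example latents, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def gaussian_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def estimated_bits(y: float, mu: float, sigma: float) -> float:
    """Information content -log2 P(y) of a rounded latent under N(mu, sigma^2).

    P(y) is the Gaussian probability mass on the interval [y - 0.5, y + 0.5].
    """
    upper = gaussian_cdf((y + 0.5 - mu) / sigma)
    lower = gaussian_cdf((y - 0.5 - mu) / sigma)
    p = max(upper - lower, 1e-9)  # floor avoids log(0)
    return -math.log2(p)


def allocate_symbols(bits, eta, allowed):
    """Map each entropy estimate to the nearest allowed channel-symbol count.

    eta (assumed) scales estimated bits to a target number of channel symbols.
    """
    return [min(allowed, key=lambda k: abs(k - eta * b)) for b in bits]


# Hypothetical latents given as (value, predicted mean, predicted scale).
latents = [(0.0, 0.0, 0.5), (3.0, 0.0, 2.0), (5.0, 0.0, 1.0)]
bits = [estimated_bits(y, mu, s) for y, mu, s in latents]
symbols = allocate_symbols(bits, eta=2.0, allowed=[1, 2, 4, 8, 16, 32])

for (y, mu, s), b, k in zip(latents, bits, symbols):
    print(f"y={y:4.1f} mu={mu:4.1f} sigma={s:3.1f} -> {b:6.2f} bits, {k:2d} symbols")
```

In this sketch, a latent that the entropy model predicts well (high probability) costs few bits and receives few channel symbols, while a poorly predicted latent receives more. A contextual entropy model would, under this view, sharpen the predicted `mu` and `sigma` using neighbouring latents, which lowers the estimated bits and therefore the bandwidth spent.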