Semantic Communication via Rate Distortion Perception Bottleneck

Zihe Zhao, Chunyue Wang
2024-05-16
Abstract: With the advancement of Artificial Intelligence (AI) technology, next-generation wireless communication networks are facing unprecedented challenges. Semantic communication has emerged as a novel solution to these challenges, improving bandwidth efficiency by transmitting meaningful information and filtering out superfluous data. Unfortunately, recent studies have shown that classical Shannon information theory focuses primarily on bit-level distortion, which cannot adequately address the perceptual quality of data reconstructed at the receiver. In this work, we consider the impact of semantic-level distortion on semantic communication. We develop an image inference network based on the Information Bottleneck (IB) framework and concurrently establish an image reconstruction network. Together, these networks are designed to jointly optimize perception, bit-level distortion, and image inference under information compression. To remain consistent with the IB principle when handling high-dimensional data, we employ variational approximation to simplify the optimization problem. Finally, we confirm the existence of the rate-distortion-perception tradeoff within the IB framework through experiments on the MNIST dataset.
Signal Processing
What problem does this paper attempt to address?
The problem this paper addresses is how to balance transmission efficiency against perceptual quality in semantic communication. Specifically, classical Shannon information theory focuses mainly on bit-level distortion, which cannot fully account for the perceptual quality of data reconstructed at the receiver. To overcome this limitation, the paper proposes an image inference network based on the Information Bottleneck (IB) framework and, alongside it, an image reconstruction network. Together, the two networks aim to jointly optimize perceptual quality, bit-level distortion, and image inference under information compression.

### Main Contributions

1. **Perceptual Distortion Problem in Semantic Communication**: The paper points out that existing bit-level distortion metrics, such as Mean Absolute Error (MAE) and Peak Signal-to-Noise Ratio (PSNR), cannot effectively measure semantic-level distortion. It therefore introduces perceptual metrics to evaluate the quality of data reconstruction more comprehensively.
2. **Application of the Information Bottleneck Method**: Using the information bottleneck method, the paper transforms the traditional constrained optimization problem into an unconstrained one, which simplifies the optimization process and, according to the paper, makes the model more likely to converge to the global optimum.
3. **Experimental Verification**: The paper conducts experiments on the MNIST dataset to verify the effectiveness of the proposed Rate-Distortion-Perception-Bottleneck (RDPB) model. The results show that the model achieves a good balance between perceptual quality and bit-level distortion at different transmission rates.
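The step from a constrained to an unconstrained problem in the second contribution can be written out as a short worked form. This is only a sketch: the budgets \(R\), \(D\), and \(P\) are symbols introduced here for illustration and are not taken from the paper, \(\hat{Z}\) denotes the received feature, \(\hat{X}\) the reconstructed image, and \(p(\hat{z}|x)\) the encoder together with the channel. Under those assumptions, the constrained problem might read:

\[ \min_{p(\hat{z}|x)} \; -I(\hat{Z};Y) \quad \text{s.t.} \quad I(X;\hat{Z})\le R,\quad \mathbb{E}[\Delta(X,\hat{X})]\le D,\quad D_{\text{KL}}(p_X\|p_{\hat{X}})\le P \]

Attaching nonnegative multipliers \(\beta\), \(\lambda\), and \(\mu\) to the three constraints and dropping the constant budget terms gives an unconstrained objective of the form:

\[ \mathcal{L}=-I(\hat{Z};Y)+\beta\, I(X;\hat{Z})+\lambda\,\mathbb{E}[\Delta(X,\hat{X})]+\mu\, D_{\text{KL}}(p_X\|p_{\hat{X}}) \]

Each choice of \((\beta,\lambda,\mu)\) then selects one operating point on the rate-distortion-perception tradeoff. The mutual information terms are generally intractable for high-dimensional data, which is why the paper resorts to variational approximation.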
### Technical Details

- **System Model**: The paper constructs a task-oriented communication system model in which the input data \(x\) and the target variable \(y\) are regarded as realizations of a pair of random variables \((X, Y)\). The encoded feature, the received feature, the inference result, and the recovered image are represented by the random variables \(Z\), \(\hat{Z}\), \(\hat{Y}\), and \(\hat{X}\), respectively.
- **Rate-Distortion-Perception-Bottleneck Function**: The paper proposes the following objective function:
  \[ L_{\text{RDPB}}(\theta)=\mathbb{E}_{p(x,y)}\left[\mathbb{E}_{p_{\phi}(\hat{z}|x)}\left[-\log p_{\theta}(y|\hat{z})\right]+\beta D_{\text{KL}}(p_{\phi}(\hat{z}|x)\|p(\hat{z}))+\lambda\mathbb{E}[\Delta(X,\hat{X})]+\mu D_{\text{KL}}(p_X\|p_{\hat{X}})\right] \]
  where \(D_{\text{KL}}\) denotes the Kullback-Leibler divergence, which measures the difference between distributions; \(\Delta(X,\hat{X})\) denotes the bit-level distortion measure, such as Mean Squared Error (MSE); and \(D_{\text{KL}}(p_X\|p_{\hat{X}})\), the divergence between the distributions of the source and reconstructed images, serves as the perception term. The weights \(\beta\), \(\lambda\), and \(\mu\) control the tradeoff among rate, distortion, and perception.
- **Variational Approximation**: To handle the computational complexity of high-dimensional distributions, the paper adopts variational approximation and introduces the variational distributions \(q(\hat{z})\) and \(q(y|\hat{z})\) to approximate the true distributions \(p(\hat{z})\) and \(p(y|\hat{z})\).

### Experimental Results

- **Rate-Distortion-Perception Curve**: The paper shows the tradeoff curve between perceptual quality and bit-level distortion at different transmission rates. The results show that as the transmission rate increases, perceptual quality improves significantly while bit-level distortion remains low.
- **t-SNE Visualization**: Using t-SNE, the paper visualizes the distribution of the noisy feature vector \(\hat{z}\) in the MNIST classification task. The results show that, even at lower accuracy, the enhanced semantic awareness enables the inference network to extract higher-quality information from the feature vector.

### Conclusion

By combining the information bottleneck method with variational approximation, the paper addresses the perception problem in semantic communication and experimentally confirms a rate-distortion-perception tradeoff within the IB framework on MNIST.
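To make the four terms of the RDPB objective from the Technical Details concrete, the following is a minimal single-sample sketch in plain Python. It is not the authors' implementation. It assumes a diagonal-Gaussian encoder with a standard normal prior (so the rate term has a closed form) and pixel values in [0, 1]. It also replaces the perception term with a KL divergence between smoothed pixel-intensity histograms, a crude stand-in chosen here because the summary does not say how the paper estimates that term. All function and parameter names are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Inference term: -log p_theta(y | z_hat) for one sample."""
    return -math.log(probs[label])

def kl_gaussian_to_standard_normal(mean, log_var):
    """Rate term: KL(N(mean, diag(exp(log_var))) || N(0, I)), closed form.
    Assumption: Gaussian encoder and standard normal prior."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))

def mse(x, x_hat):
    """Bit-level distortion term: mean squared error over pixels."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def histogram(values, bins=4, eps=1e-6):
    """Smoothed, normalized histogram of values assumed to lie in [0, 1]."""
    counts = [eps] * bins
    for v in values:
        idx = max(0, min(int(v * bins), bins - 1))
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_discrete(p, q):
    """KL divergence between two discrete distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def rdpb_loss(probs, label, mean, log_var, x, x_hat, beta, lam, mu):
    """Weighted sum: inference + beta*rate + lam*distortion + mu*perception."""
    inference = cross_entropy(probs, label)
    rate = kl_gaussian_to_standard_normal(mean, log_var)
    distortion = mse(x, x_hat)
    # Stand-in for D_KL(p_X || p_X_hat): histogram KL on a single image.
    perception = kl_discrete(histogram(x), histogram(x_hat))
    return inference + beta * rate + lam * distortion + mu * perception

# Toy usage: a 4-pixel "image", a 2-dimensional feature, and 2 classes.
loss = rdpb_loss(probs=[0.7, 0.3], label=0,
                 mean=[0.2, -0.1], log_var=[0.0, 0.0],
                 x=[0.0, 0.5, 1.0, 0.25], x_hat=[0.1, 0.4, 0.9, 0.3],
                 beta=0.01, lam=1.0, mu=0.1)
print(loss)
```

Sweeping `beta`, `lam`, and `mu` and retraining for each setting is one way a tradeoff curve like the one reported in the experiments could be traced. In a real system, the perception term would be estimated over many samples rather than from one image's histogram.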