SC-CDM: Enhancing Quality of Image Semantic Communication with a Compact Diffusion Model

Kexin Zhang, Lixin Li, Wensheng Lin, Yuna Yan, Wenchi Cheng, Zhu Han
2024-10-03
Abstract: Semantic Communication (SC) is an emerging technology that has attracted much attention in sixth-generation (6G) mobile communication systems. However, few studies have fully considered the perceptual quality of the reconstructed image. To solve this problem, we propose a generative SC for wireless image transmission (denoted as SC-CDM). This approach leverages compact diffusion models to improve the fidelity and semantic accuracy of the images reconstructed after transmission, ensuring that the essential content is preserved even in bandwidth-constrained environments. Specifically, we redesign the Swin Transformer as a new backbone for efficient semantic feature extraction and compression. Next, the receiver integrates the slim prior and image reconstruction networks. Unlike traditional Diffusion Models (DMs), the proposed model uses DMs' robust distribution mapping capability to generate a compact condition vector that guides image recovery, thus enhancing the perceptual details of the reconstructed images. Finally, a series of evaluation and ablation studies validates the effectiveness and robustness of the proposed algorithm, which increases the Peak Signal-to-Noise Ratio (PSNR) by over 17% compared with CNN-based DeepJSCC.
Image and Video Processing, Machine Learning, Networking and Internet Architecture
What problem does this paper attempt to address?
The paper addresses the following problem: in sixth-generation (6G) mobile communication systems, existing Semantic Communication (SC) techniques do not fully consider the perceptual quality of the reconstructed image during image transmission. Specifically, traditional methods focus on pixel-level distortion or structural similarity while ignoring perceptual distortion, that is, the visual authenticity and detail retention of the image. To solve this problem, the authors propose a generative semantic communication framework (SC-CDM) based on a compact diffusion model. It aims to improve fidelity and semantic accuracy in wireless image transmission and to ensure that the core content of the image is retained even in a bandwidth-limited environment.

The main contributions of the paper are:

1. **Design a new semantic communication architecture**:
   - Use the Swin Transformer as the backbone network for efficient semantic feature extraction and compression, avoiding the need for accurate bit-stream recovery and thereby reducing communication costs.
   - Redesign the Swin Transformer to improve computational efficiency and the flexibility of feature processing.
2. **Introduce the compact diffusion model**:
   - Use the compact diffusion model in the generative semantic communication framework to reconstruct complex scenes from compressed semantic information.
   - Use the powerful distribution mapping ability of the diffusion model to compute a compact condition vector that guides image restoration, significantly reducing computational requirements.
3. **Experimental verification and performance improvement**:
   - Verify the effectiveness and robustness of the proposed algorithm through a series of evaluations and ablation studies.
   - Show experimentally that, compared with the traditional CNN-based DeepJSCC method, SC-CDM improves the Peak Signal-to-Noise Ratio (PSNR) by more than 17%.

In summary, the paper proposes a new semantic communication framework that combines the compact diffusion model with an improved Swin Transformer, aiming to improve the quality and efficiency of image transmission, especially in a bandwidth-limited environment.
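For readers unfamiliar with the headline metric, the following is a minimal sketch of how PSNR is conventionally computed, using the standard definition PSNR = 10 log10(MAX^2 / MSE). It is not taken from the paper. The function names `psnr` and `relative_gain`, the 8-bit peak value of 255, and the example dB figures are all assumptions made for illustration. The source also does not say whether "over 17%" is a relative change in dB values; the helper below assumes that reading.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Return PSNR in dB for two equally sized flat sequences of pixel values.

    Assumes 8-bit pixels by default (max_value=255); this is an illustrative
    choice, not a detail confirmed by the source.
    """
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and reconstructed pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain(baseline_db, proposed_db):
    """Relative PSNR change, assuming the percentage is a ratio of dB values."""
    return (proposed_db - baseline_db) / baseline_db


# Hypothetical numbers, chosen only to show what a 17% relative gain looks like.
print(relative_gain(30.0, 35.1))
```

Because PSNR depends only on pixel-wise squared error, it measures distortion rather than perceptual quality, which is why the summary above distinguishes the two.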