Lightweight Diffusion Models for Resource-Constrained Semantic Communication

Giovanni Pignata, Eleonora Grassucci, Giordano Cicchetti, Danilo Comminiello
2024-10-03
Abstract: Recently, generative semantic communication models have proliferated as they are revolutionizing semantic communication frameworks, improving their performance, and opening the way to novel applications. Despite their impressive ability to regenerate content from the compressed semantic information received, generative models pose crucial challenges for communication systems in terms of high memory footprints and heavy computational load. In this paper, we present a novel Quantized GEnerative Semantic COmmunication framework, Q-GESCO. The core method of Q-GESCO is a quantized semantic diffusion model capable of regenerating transmitted images from the received semantic maps while simultaneously reducing computational load and memory footprint thanks to the proposed post-training quantization technique. Q-GESCO is robust to different channel noises and obtains comparable performance to the full precision counterpart in different scenarios, saving up to 75% memory and 79% floating point operations. This allows resource-constrained devices to exploit the generative capabilities of Q-GESCO, widening the range of applications and systems for generative semantic communication frameworks. The code is available at https://github.com/ispamm/Q-GESCO.
Signal Processing
What problem does this paper attempt to address?
The main problem this paper attempts to address is the deployment challenge of generative semantic communication models on resource-constrained devices. Although generative models perform well in semantic communication frameworks, they typically require substantial computational resources and memory, making them difficult to run efficiently on mobile devices and edge computing platforms. Specifically, while diffusion models excel in generating high-quality images, their high computational load and large memory requirements make them challenging to deploy in resource-limited environments.

To tackle this challenge, the authors propose a new framework called Q-GESCO (Quantized GEnerative Semantic COmmunication). Q-GESCO significantly reduces the computational load and memory footprint of the model through Post-Training Quantization (PTQ) techniques while maintaining generation performance comparable to full-precision models. The specific improvements include:

1. **Quantized Generative Model**: The model parameters are compressed from 32-bit floating-point numbers to 8-bit integers using quantization techniques, significantly reducing memory usage and computational load.
2. **Noise-Aware Calibration**: A noise-aware calibration data sampling mechanism is introduced to ensure the model's robustness under different noise conditions.
3. **Time-Step-Aware Calibration**: Intermediate inputs are uniformly sampled at different time steps to generate a calibration dataset, better simulating the activation distribution in real-world applications.
4. **Blockwise Quantization Technique**: Quantization is performed separately in different blocks of the model to manage abnormal activations and weight distributions in shortcut layers, further reducing the accumulation of quantization errors.

Experimental results show that Q-GESCO can generate high-quality images while saving memory and computational resources, and it performs well under different noise conditions.
This enables Q-GESCO to run efficiently on resource-constrained devices, broadening the application scope of generative semantic communication frameworks.
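
To make the quantization and calibration ideas above more concrete, here is a minimal Python sketch. It is not the Q-GESCO implementation. It assumes a simple symmetric, per-tensor 8-bit scheme, reads "uniformly sampled" time steps as evenly spaced ones, and pairs each time step with a randomly chosen channel noise level. The function names (`quantize_int8`, `dequantize`, `memory_saving`, `calibration_plan`) and the SNR values in the usage lines are hypothetical and chosen only for illustration.

```python
import random


def quantize_int8(weights):
    """Symmetric uniform quantization of floats to signed 8-bit codes.

    Assumption: one scale per tensor; the paper's PTQ scheme may differ.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 or 1.0  # fall back to 1.0 for an all-zero tensor
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map 8-bit codes back to approximate float values."""
    return [c * scale for c in codes]


def memory_saving(bits_from, bits_to):
    """Fraction of parameter memory saved by lowering the bit width."""
    return 1 - bits_to / bits_from


def calibration_plan(num_timesteps, num_samples, snr_levels_db, seed=0):
    """Pick (time step, channel SNR) pairs for building a calibration set.

    Assumption: time steps are evenly spaced over the denoising schedule,
    and each one is paired with a randomly drawn noise level.
    """
    rng = random.Random(seed)
    plan = []
    for i in range(num_samples):
        t = i * num_timesteps // num_samples  # evenly spaced time step
        plan.append((t, rng.choice(snr_levels_db)))
    return plan


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)  # each value is within scale / 2 of the original
saving = memory_saving(32, 8)        # 0.75, matching the "up to 75% memory" figure
plan = calibration_plan(1000, 10, [1, 10, 20])
```

In a real pipeline, each `(t, snr)` pair would be used to corrupt a semantic map at that noise level, run the full-precision model up to step `t`, and record the intermediate input so that activation ranges can be estimated before quantization.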