Generative Semantic Communication: Diffusion Models Beyond Bit Recovery

Eleonora Grassucci, Sergio Barbarossa, Danilo Comminiello
2023-06-07
Abstract: Semantic communication is expected to be one of the cores of next-generation AI-based communications. One of the possibilities offered by semantic communication is the capability to regenerate, at the destination side, images or videos semantically equivalent to the transmitted ones, without necessarily recovering the transmitted sequence of bits. The current solutions still lack the ability to build complex scenes from the received partial information. Clearly, there is an unmet need to balance the effectiveness of generation methods and the complexity of the transmitted information, possibly taking into account the goal of communication. In this paper, we aim to bridge this gap by proposing a novel generative diffusion-guided framework for semantic communication that leverages the strong abilities of diffusion models in synthesizing multimedia content while preserving semantic features. We reduce bandwidth usage by sending highly-compressed semantic information only. Then, the diffusion model learns to synthesize semantic-consistent scenes through spatially-adaptive normalizations from such denoised semantic information. We prove, through an in-depth assessment of multiple scenarios, that our method outperforms existing solutions in generating high-quality images with preserved semantic information even in cases where the received content is significantly degraded. More specifically, our results show that objects, locations, and depths are still recognizable even in the presence of extremely noisy conditions of the communication channel. The code is available at https://github.com/ispamm/GESCO.
Artificial Intelligence, Multimedia
What problem does this paper attempt to address?

This paper addresses key issues in semantic communication, particularly in the context of sixth-generation (6G) wireless networks. Specifically, its goals are:

1. **Reconstructing complex scenes**: Existing methods struggle to reconstruct complex scenes from partially received information. The paper proposes a generative diffusion-guided framework that uses diffusion models to synthesize multimedia content while preserving semantic features.
2. **Reducing bandwidth usage**: Only highly compressed semantic information is transmitted. The receiver can then generate high-quality images that retain the semantic content even under poor channel conditions.
3. **Improving robustness**: Objects, locations, and depths remain recognizable in the generated images even under extremely noisy channel conditions, which the authors report as an improvement over existing solutions.

Through an assessment of multiple scenarios, the paper reports that the method outperforms existing solutions across channel conditions and remains effective in high-noise environments.
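
The idea of transmitting only compact semantic information over a noisy channel can be illustrated with a toy sketch. This is not the paper's pipeline. It assumes, purely for illustration, that the semantic information is a one-hot segmentation map, that the channel is additive white Gaussian noise (AWGN) at a chosen SNR, and that a per-pixel argmax stands in for the learned denoising step. All function names (`one_hot`, `awgn`, `recover_labels`) and parameter values are hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Encode an (H, W) integer label map as a (num_classes, H, W) one-hot map."""
    return np.eye(num_classes, dtype=np.float64)[labels].transpose(2, 0, 1)

def awgn(signal, snr_db, rng):
    """Add white Gaussian noise scaled so the signal-to-noise ratio is snr_db."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)

def recover_labels(received):
    """Naive stand-in for a learned denoiser: keep the strongest class per pixel."""
    return np.argmax(received, axis=0)

# Assumed toy setup: 8 semantic classes on a 32x32 random label map.
rng = np.random.default_rng(0)
num_classes = 8
labels = rng.integers(0, num_classes, size=(32, 32))
sent = one_hot(labels, num_classes)

# Report how many pixel labels survive the channel at each noise level.
for snr_db in (20, 10, 0):
    received = awgn(sent, snr_db, rng)
    accuracy = np.mean(recover_labels(received) == labels)
    print(f"SNR {snr_db:>2} dB: pixel accuracy {accuracy:.3f}")
```

In this sketch the receiver only needs to recover the class of each pixel, not the exact transmitted values. In the framework described above, a generative model would then synthesize an image conditioned on the recovered semantic map, a step this example deliberately leaves out.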