Rate-Adaptive Coding Mechanism for Semantic Communications With Multi-Modal Data

Yangshuo He, Guanding Yu, Yunlong Cai
2023-05-18
Abstract: Recently, the ever-increasing demand for bandwidth in multi-modal communication systems requires a paradigm shift. Powered by deep learning, semantic communications are applied to multi-modal scenarios to boost communication efficiency and save communication resources. However, the existing end-to-end neural network (NN) based framework without the channel encoder/decoder is incompatible with modern digital communication systems. Moreover, most end-to-end designs are task-specific and require re-design and re-training for new tasks, which limits their applications. In this paper, we propose a distributed multi-modal semantic communication framework incorporating the conventional channel encoder/decoder. We adopt NN-based semantic encoder and decoder to extract correlated semantic information contained in different modalities, including speech, text, and image. Based on the proposed framework, we further establish a general rate-adaptive coding mechanism for various types of multi-modal semantic tasks. In particular, we utilize unequal error protection based on semantic importance, which is derived by evaluating the distortion bound of each modality. We further formulate and solve an optimization problem that aims at minimizing inference delay while maintaining inference accuracy for semantic tasks. Numerical results show that the proposed mechanism fares better than both conventional communication and existing semantic communication systems in terms of task performance, inference delay, and deployment complexity.
Signal Processing, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to address the growing bandwidth demands in multimodal communication systems and proposes a novel distributed multimodal semantic communication framework. Specifically, the paper attempts to solve the following two main issues:

1. **Compatibility Issue**: Existing end-to-end neural network (NN) frameworks lack traditional channel encoders/decoders, making them incompatible with modern digital communication systems. Moreover, most end-to-end designs are task-specific, requiring redesign and retraining for new tasks, which limits their applicability.
2. **Adaptability and Efficiency Issue**: Current research mostly employs end-to-end neural networks to extract semantic information and mitigate the impact of channel noise. However, this design necessitates retraining the neural network when transmitting data, which is often impractical in real multimodal scenarios. For instance, in distributed sensor networks, retraining encoders with various decoders is unrealistic.

To address these issues, the paper proposes a distributed multimodal semantic communication framework that includes traditional physical layer modules. This framework combines neural network-based semantic source encoders with traditional channel encoders, enabling effective protection of semantic information within existing communication systems. Additionally, the paper introduces an adaptive coding mechanism that provides unequal error protection to different modalities based on semantic importance, thereby minimizing inference delay while maintaining task performance.
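The idea of unequal error protection can be illustrated with a toy example. The sketch below is not the paper's algorithm: the candidate code rates, the per-rate error proxy values, the importance weights, and the function name `allocate_rates` are all hypothetical, chosen only to show the trade-off. It assumes that the coded bit count `payload / rate` stands in for delay at a fixed bandwidth, and that task degradation can be approximated by an importance-weighted sum of per-modality error proxies. A brute-force search then picks one code rate per modality.

```python
from itertools import product

# Hypothetical (code rate, residual error proxy) pairs. A lower rate means
# more redundancy, so fewer residual errors but more bits on the air.
# These numbers are illustrative assumptions, not measured values.
OPTIONS = [(1/3, 0.001), (1/2, 0.005), (2/3, 0.02), (3/4, 0.05), (5/6, 0.10)]


def allocate_rates(payload_bits, importance, distortion_budget):
    """Pick one code rate per modality by exhaustive search.

    Minimizes the total number of coded bits (a proxy for delay) subject to
    an importance-weighted distortion proxy staying within the budget.
    Returns (total_coded_bits, {modality: rate}) or None if infeasible.
    """
    names = list(payload_bits)
    best = None
    for choice in product(OPTIONS, repeat=len(names)):
        # Weighted distortion proxy for this combination of code rates.
        distortion = sum(importance[n] * err
                         for n, (_, err) in zip(names, choice))
        if distortion > distortion_budget:
            continue
        # Coded bits grow as the code rate shrinks.
        coded_bits = sum(payload_bits[n] / rate
                         for n, (rate, _) in zip(names, choice))
        if best is None or coded_bits < best[0]:
            rates = {n: rate for n, (rate, _) in zip(names, choice)}
            best = (coded_bits, rates)
    return best


# Hypothetical semantic payload sizes (bits) and importance weights.
payload_bits = {"speech": 4000, "text": 1000, "image": 8000}
importance = {"speech": 0.2, "text": 0.7, "image": 0.1}

result = allocate_rates(payload_bits, importance, distortion_budget=0.02)
if result is not None:
    total, rates = result
    for name, rate in rates.items():
        print(f"{name}: code rate {rate:.3f}")
    print(f"total coded bits: {total:.0f}")
```

With these made-up inputs, the modality with the highest importance weight (text) ends up with the strongest protection, while the large but less important image payload is sent at the highest code rate to keep the total bit count down. The paper itself derives semantic importance from per-modality distortion bounds and solves a formal optimization problem, so this exhaustive search only mirrors the shape of the trade-off.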