A Unified Multi-Task Semantic Communication System for Multimodal Data

Guangyi Zhang, Qiyu Hu, Zhijin Qin, Yunlong Cai, Guanding Yu, Xiaoming Tao
2024-06-08
Abstract: Task-oriented semantic communications have achieved significant performance gains. However, the employed deep neural networks in semantic communications have to be updated when the task is changed or multiple models need to be stored for performing different tasks. To address this issue, we develop a unified deep learning-enabled semantic communication system (U-DeepSC), where a unified end-to-end framework can serve many different tasks with multiple modalities of data. As the number of required features varies from task to task, we propose a vector-wise dynamic scheme that can adjust the number of transmitted symbols for different tasks. Moreover, our dynamic scheme can also adaptively adjust the number of transmitted features under different channel conditions to optimize the transmission efficiency. Particularly, we devise a lightweight feature selection module (FSM) to evaluate the importance of feature vectors, which can hierarchically drop redundant feature vectors and significantly accelerate the inference. To reduce the transmission overhead, we then design a unified codebook for feature representation to serve multiple tasks, where only the indices of these task-specific features in the codebook are transmitted. According to the simulation results, the proposed U-DeepSC achieves comparable performance to the task-oriented semantic communication system designed for a specific task but with significant reduction in both transmission overhead and model size.
Signal Processing
What problem does this paper attempt to address?
This paper addresses how to build a unified multi-task semantic communication system for multimodal data, so that the model does not have to be updated when the task changes and multiple task-specific models do not have to be stored. Existing deep learning-based semantic communication systems usually handle a single task and a single modality. When the task changes, or data of a different modality must be processed, the model has to be retrained or a separate model has to be stored, which is impractical on resource-limited edge devices. The paper therefore proposes a unified deep learning-enabled semantic communication system (U-DeepSC) that serves multiple tasks and multiple modalities with one fixed model framework while reducing transmission overhead and model size.

The paper makes the following key contributions:

1. **Unified Semantic Communication Framework**: U-DeepSC is a general-purpose framework that supports the transmission of three data modalities: image, text, and voice.
2. **Dynamic Feature Selection Module (FSM)**: A lightweight feature selection module dynamically adjusts the number of transmitted features under different channel conditions, giving an adaptive trade-off between transmission rate and task performance. The FSM also hierarchically prunes redundant feature vectors, which reduces computational complexity and accelerates inference.
3. **Unified Codebook Design**: A unified codebook is developed for multi-task services. It supports digital communication and reduces transmission overhead: features are represented discretely, and only the indices of the encoded features in the codebook are transmitted.
4. **Unified Decoder**: A unified decoder is built on the Transformer decoder and introduces a masked cross-attention method to enable parallel training. In addition, a two-stage training algorithm is proposed to learn multiple tasks simultaneously.

Through these innovations, U-DeepSC significantly reduces transmission overhead and the number of model parameters while maintaining performance comparable to that of task-specific semantic communication systems.
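To make the feature selection and codebook ideas more concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes importance scores are already available (in U-DeepSC they come from a learned FSM; here they are random placeholders) and uses a plain nearest-neighbor lookup as a stand-in for the unified codebook. The names `select_features`, `quantize`, `dequantize`, and `keep_ratio` are hypothetical and chosen only for illustration.

```python
import numpy as np


def select_features(features, scores, keep_ratio):
    """Keep the highest-scoring feature vectors.

    Hypothetical stand-in for the FSM: the scores are given as input
    rather than predicted by a learned module.
    """
    num_keep = max(1, int(round(keep_ratio * len(features))))
    # Indices of the num_keep largest scores, restored to original order.
    kept = np.sort(np.argsort(scores)[::-1][:num_keep])
    return features[kept], kept


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    # Squared Euclidean distance between every feature and every codeword.
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def dequantize(indices, codebook):
    """Receiver side: recover feature vectors from the received indices."""
    return codebook[indices]


# Illustrative data only: 8 feature vectors of dimension 4, 16 codewords.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4))
scores = rng.random(8)               # placeholder importance scores
codebook = rng.normal(size=(16, 4))  # placeholder for a trained codebook

# A smaller keep_ratio could be chosen for a task that needs fewer features.
kept_features, kept = select_features(features, scores, keep_ratio=0.5)
indices = quantize(kept_features, codebook)   # only these are transmitted
recovered = dequantize(indices, codebook)

print(kept_features.shape, indices.shape, recovered.shape)
```

In this sketch the transmitter sends only the integer `indices`, so the payload grows with the number of kept vectors rather than with the feature dimension. How the keep ratio is tied to the task and channel condition, and how the codebook is trained, are left out here and follow the paper's own design.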