Robust Image Semantic Coding with Learnable CSI Fusion Masking over MIMO Fading Channels

Bingyan Xie, Yongpeng Wu, Yuxuan Shi, Wenjun Zhang, Shuguang Cui, Merouane Debbah
2024-05-30
Abstract: Despite remarkable progress in various scenarios, existing semantic communication frameworks mainly consider single-input single-output Gaussian channels or Rayleigh fading channels and neglect the widely used multiple-input multiple-output (MIMO) channels, which hinders their application to practical systems. One common solution to combat MIMO fading is to utilize feedback MIMO channel state information (CSI). In this paper, we incorporate MIMO CSI into the system design from a new perspective and propose the learnable CSI fusion semantic communication (LCFSC) framework, where CSI is treated as side information by the semantic extractor to enhance the semantic coding. To avoid feature fusion due to the abrupt combination of CSI with features, we present a non-invasive CSI fusion multi-head attention module inside the Swin Transformer. With the learned attention masking map determined by both source and channel states, a more robust attention distribution can be generated. Furthermore, the percentage of masked elements can be flexibly adjusted by the learnable mask ratio, which is produced in an unsupervised manner based on conditional variational inference. In this way, CSI-aware semantic coding is achieved through learnable CSI fusion masking. Experimental results demonstrate the superiority of LCFSC over traditional schemes and state-of-the-art Swin Transformer-based semantic communication frameworks in MIMO fading channels.
Information Theory, Machine Learning
What problem does this paper attempt to address?
This paper aims to achieve robust image semantic coding over multiple-input multiple-output (MIMO) fading channels. Existing semantic communication frameworks mainly consider single-input single-output Gaussian channels or Rayleigh fading channels and ignore the widely used MIMO channels, which limits their application in practical systems. The paper addresses this by feeding MIMO channel state information (CSI) to the semantic extractor as side information, making semantic coding more robust. Specifically, it proposes the "Learnable CSI Fusion Semantic Communication" (LCFSC) framework, in which CSI is fused with features through the Non-Invasive CSI Fusion Multi-Head Attention (NI-CFMA) module. This module produces a more robust attention distribution by generating an attention mask map jointly determined by the source and the channel state. Moreover, the proportion of masked elements can be flexibly adjusted by a learnable mask ratio, which is generated in an unsupervised manner based on conditional variational inference. In this way, CSI-aware semantic coding is achieved.

The main contributions of the paper include:

1. **LCFSC Framework**: A new framework that efficiently integrates the feedback CSI of the 5G MIMO environment into a semantic communication system built on the Swin Transformer backbone, ensuring robust semantic coding.
2. **CSI-Fusion Masked Semantic Extractor**: A CSI-fusion design that replaces the classic multi-head attention module in the Swin Transformer with the proposed non-invasive CSI-fusion multi-head attention module, which generates a more robust attention distribution by masking unimportant and severely interfered semantic elements.
3. **Learnable Mask Ratio**: A pre-processing stage that adaptively controls the proportion of semantic attention elements masked in NI-CFMA by generating a mask ratio suited to the current state.
4. **Noise-Purifying Channel Estimator**: A channel estimator that obtains a coarse CSI estimate by the least-squares method and purifies it into a finer CSI estimate based on noise-space projection.

With these innovations, the LCFSC framework exhibits superior performance in MIMO fading channels, with a performance improvement of more than 2 dB compared to existing works.
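
The idea of masking a ratio-controlled share of attention elements can be illustrated with a minimal NumPy sketch. It is not the paper's NI-CFMA module: the function name `masked_attention`, the per-token `importance` score (a stand-in for whatever the learned source/CSI fusion produces), and the key-wise top-k masking rule are all assumptions made for illustration, and `mask_ratio` is passed in as a plain number rather than learned.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf receive weight 0.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def masked_attention(q, k, v, importance, mask_ratio):
    """Scaled dot-product attention with a ratio-controlled key mask.

    q, k, v:     arrays of shape (n_tokens, dim).
    importance:  hypothetical per-token score of shape (n_tokens,); here it
                 stands in for a learned source/CSI fusion output.
    mask_ratio:  fraction of tokens to mask, in [0, 1].
    """
    n, d = q.shape
    # Number of keys to mask; always keep at least one key visible.
    n_mask = min(int(np.floor(mask_ratio * n)), n - 1)

    logits = q @ k.T / np.sqrt(d)
    if n_mask > 0:
        # Mask the keys with the lowest importance scores.
        masked_idx = np.argsort(importance)[:n_mask]
        logits[:, masked_idx] = -np.inf

    weights = softmax(logits, axis=-1)
    return weights @ v, weights


# Example usage with random tensors (illustrative values only).
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
importance = rng.random(8)
out, weights = masked_attention(q, k, v, importance, mask_ratio=0.25)
```

In this sketch a larger `mask_ratio` removes more keys from every query's attention distribution, which is the knob that the summary says LCFSC adapts to the current source and channel state.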