Abstract: Despite marvelous progress in various scenarios, existing semantic communication frameworks mainly consider single-input single-output Gaussian channels or Rayleigh fading channels and neglect the widely used multiple-input multiple-output (MIMO) channels, which hinders their application in practical systems. One common solution to combat MIMO fading is to utilize feedback MIMO channel state information (CSI). In this paper, we incorporate MIMO CSI into system designs from a new perspective and propose the learnable CSI fusion semantic communication (LCFSC) framework, where CSI is treated as side information by the semantic extractor to enhance the semantic coding. To avoid the feature fusion caused by abruptly combining CSI with features, we present a non-invasive CSI fusion multi-head attention module inside the Swin Transformer. With the learned attention masking map determined by both source and channel states, a more robust attention distribution can be generated. Furthermore, the percentage of masked elements can be flexibly adjusted by the learnable mask ratio, which is produced by conditional variational inference in an unsupervised manner. In this way, CSI-aware semantic coding is achieved through learnable CSI fusion masking. Experimental results demonstrate the superiority of LCFSC over traditional schemes and state-of-the-art Swin Transformer-based semantic communication frameworks in MIMO fading channels.

What problem does this paper attempt to address?

This paper aims to achieve robust image semantic coding over multiple-input multiple-output (MIMO) fading channels. Existing semantic communication frameworks mainly consider single-input single-output Gaussian or Rayleigh fading channels and ignore the widely used MIMO channels, which limits their use in practical systems. The paper addresses this by feeding MIMO channel state information (CSI) to the semantic extractor as side information, making semantic coding more robust. Specifically, it proposes the "Learnable CSI Fusion Semantic Communication" (LCFSC) framework, in which CSI is combined with features through the Non-Invasive CSI Fusion Multi-Head Attention (NI-CFMA) module. This module produces an attention mask map determined jointly by the source and the channel state, which yields a more robust attention distribution. The proportion of masked elements is controlled by a learnable mask ratio, generated in an unsupervised manner via conditional variational inference. In this way, CSI-aware semantic coding is achieved. The main contributions of the paper are:

1. **LCFSC Framework**: A new framework that efficiently integrates the feedback CSI of the 5G MIMO environment into a semantic communication system built on the Swin Transformer backbone, ensuring robust semantic coding.
2. **CSI-Fusion Masked Semantic Extractor**: A CSI-fusion design that replaces the classic multi-head attention module in the Swin Transformer with the proposed non-invasive CSI-fusion multi-head attention module, which generates a more robust attention distribution by masking unimportant and severely interfered semantic elements.
3. **Learnable Mask Ratio**: A pre-processing stage that adaptively controls the proportion of semantic attention elements masked in NI-CFMA by generating a mask ratio suited to the current state.
4. **Noise-Purifying Channel Estimator**: A channel estimator that obtains a coarse CSI estimate by the least-squares method and purifies it into a finer CSI estimate based on noise-space projection.

These innovations give the LCFSC framework superior performance in MIMO fading channels, with an improvement of more than 2 dB over existing works.
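The idea of masking attention elements according to a mask ratio can be illustrated with a small NumPy sketch. This is not the paper's NI-CFMA module. It is a generic single-head attention in which the lowest-scoring keys are hidden before the softmax. The function name `masked_attention`, the `key_scores` vector (a stand-in for a learned map that would depend on both source features and CSI), and the hard top-k style masking rule are all assumptions made for illustration. In the paper the mask ratio is learned, whereas here it is simply passed in.

```python
import numpy as np


def masked_attention(q, k, v, key_scores, mask_ratio):
    """Single-head attention that hides the lowest-scoring keys.

    q: (n_q, d) queries; k: (n_k, d) keys; v: (n_k, d_v) values.
    key_scores: (n_k,) hypothetical importance score per key
        (assumed here; a real system would derive it from source and CSI).
    mask_ratio: fraction of keys to mask, in [0, 1).
    Returns (output, attention_weights).
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must lie in [0, 1)")

    n_k = k.shape[0]
    # Flooring with mask_ratio < 1 always leaves at least one key visible.
    n_masked = int(np.floor(mask_ratio * n_k))

    # Standard scaled dot-product logits, shape (n_q, n_k).
    logits = q @ k.T / np.sqrt(q.shape[-1])

    if n_masked > 0:
        # Indices of the n_masked keys with the smallest scores.
        masked_idx = np.argsort(key_scores)[:n_masked]
        # -inf logits become exactly zero weight after the softmax.
        logits[:, masked_idx] = -np.inf

    # Numerically stable softmax over the key axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((3, 4))
    k = rng.standard_normal((5, 4))
    v = rng.standard_normal((5, 2))
    scores = np.array([0.9, 0.1, 0.5, 0.2, 0.7])  # made-up importance values
    out, w = masked_attention(q, k, v, scores, mask_ratio=0.5)
    print(np.round(w, 3))  # columns of the two lowest-scoring keys are zero
```

Raising `mask_ratio` removes more keys from every query's attention distribution, and the remaining weights are renormalized over the visible keys. A learnable ratio, as described above, would let the system mask more aggressively when the channel state is poor, but how that ratio is produced is specific to the paper and is not reproduced here.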

Robust Image Semantic Coding with Learnable CSI Fusion Masking over MIMO Fading Channels

Viewing the MIMO Channel As Sequence Rather Than Image: A Seq2Seq Approach for Efficient CSI Feedback

Viewing Channel as Sequence Rather Than Image: A 2-D Seq2Seq Approach for Efficient MIMO-OFDM CSI Feedback

Pilot-Free Semantic Communication over Multi-User MIMO Fading Channels

FSSC: Federated Learning of Transformer Neural Networks for Semantic Image Communication

Learned Image Transmission over MIMO Fading Channels

A Deep Learning Based Broadcast Approach for Image Semantic Communication over Fading Channels

Channel-Adaptive Wireless Image Semantic Transmission with Learnable Prompts

MIMO Channel as a Neural Function: Implicit Neural Representations for Extreme CSI Compression in Massive MIMO Systems

Semantic Image Transmission Based on CSI Feedback

Fusion-Based Multi-User Semantic Communications for Wireless Image Transmission over Degraded Broadcast Channels

Swin Transformer-Based Dynamic Semantic Communication for Multi-User with Different Computing Capacity

Mal-Net: Multi-Scale Feature Extraction and Attention Mechanism Lightweight Network for CSI Feedback

SCAN: Semantic Communication with Adaptive Channel Feedback

Dual-Propagation-Feature Fusion Enhanced Neural CSI Compression for Massive MIMO

Semantic Multi-Resolution Communications

An Effective Network with Discrete Latent Representation Designed for Massive MIMO CSI Feedback

CCA-Net: A Lightweight Network Using Criss-Cross Attention for CSI Feedback

DMCE: Diffusion Model Channel Enhancer for Multi-User Semantic Communication Systems

Adaptive CSI Feedback for Deep Learning-Enabled Image Transmission