IQFormer: A Novel Transformer-Based Model with Multi-Modality Fusion for Automatic Modulation Recognition

Mingyuan Shao, Dingzhao Li, Shaohua Hong, Jie Qi, Haixin Sun
DOI: https://doi.org/10.1109/tccn.2024.3485118
IF: 6.359
2024-01-01
IEEE Transactions on Cognitive Communications and Networking
Abstract: The advent of modern communication systems has led to the widespread application of deep learning-based automatic modulation recognition (DL-AMR) in wireless communications. However, existing networks still cannot effectively capture the complex relationships between signals under low signal-to-noise ratio (SNR) conditions. This paper proposes an automatic modulation recognition (AMR) method using multi-modal hybrid neural networks, named IQFormer. It takes I/Q signals and time-frequency (T-F) transform distribution matrices as inputs. To capture the inherent connection between spatio-temporal and T-F features in cross-modal features, we design a Dynamic Fusion Embedding (DFE) module. Within this module, feature information from multiple modalities is dynamically aggregated during the embedding stage, resulting in semantically enriched token sequences. Moreover, we develop a staged Transformer block scheme that allows IQFormer to efficiently extract local and global features from embedded tokens at different scales using convolution and attention mechanisms. Experimental results on the RadioML2016.10a, RadioML2016.10b, and HisarMod2019.1 datasets demonstrate the superior performance of IQFormer compared to state-of-the-art (SOTA) DL-AMR methods. Code is available at https://github.com/WestdoorSad/IQFormer.
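
To make the two input modalities concrete, the following is a minimal sketch of how a T-F matrix could be derived from a raw I/Q frame. It is not taken from the IQFormer code: the abstract does not say which T-F transform is used, so a short-time Fourier transform (STFT) with a Hann window is assumed here, and the function name `iq_to_tf_matrix`, the `(2, N)` input layout, and the `n_fft` and `hop` values are all hypothetical.

```python
import numpy as np


def iq_to_tf_matrix(iq, n_fft=32, hop=8):
    """Return an STFT magnitude matrix (n_fft x n_frames) for a (2, N) I/Q array.

    Assumption: row 0 holds the in-phase samples and row 1 the quadrature
    samples. The STFT is an illustrative choice, not the paper's stated method.
    """
    iq = np.asarray(iq, dtype=np.float64)
    x = iq[0] + 1j * iq[1]                      # complex baseband signal
    window = np.hanning(n_fft)                  # taper each frame to limit leakage
    n_frames = 1 + (len(x) - n_fft) // hop      # frames that fit without padding
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # FFT per frame, then shift so that zero frequency sits in the middle row.
    spec = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    return np.abs(spec).T


# Synthetic example: a complex tone at 0.25 cycles/sample, 128 samples long.
n = np.arange(128)
tone = np.exp(2j * np.pi * 0.25 * n)
iq = np.stack([tone.real, tone.imag])           # shape (2, 128)
tf = iq_to_tf_matrix(iq)                        # shape (32, 13)
```

In a multi-modal setup like the one the abstract describes, the raw `iq` array and the derived `tf` matrix would be the two inputs passed to the fusion stage; how the DFE module actually combines them is specified in the paper and repository, not in this sketch.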