A Novel Lightweight Attention-Discarding Transformer for High-Resolution SAR Image Classification

Xingyu Liu, Yan Wu, Xin Hu, Zhikang Li, Ming Li
DOI: https://doi.org/10.1109/lgrs.2023.3279456
IF: 5.343
2023-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: The vision transformer (ViT) has been introduced into high-resolution synthetic aperture radar (HR SAR) image classification because of its excellent global feature extraction ability. However, the limited number of SAR image samples makes it difficult to fit a ViT with its large number of trainable parameters, which easily leads to over-fitting during training. Meanwhile, the ViT's poor ability to capture local features limits its accuracy in SAR image classification. To solve these problems, this letter proposes a new lightweight attention-discarding transformer (LAD Transformer) for the classification of HR SAR images. In the proposed model, the backbone of the advanced Swin transformer is used to model global information and extract hierarchical features. Moreover, the vital feature extraction part of the LAD Transformer completely discards the self-attention mechanism and instead extracts local features of SAR images with a lighter group convolution and channel shuffle (GC-CS) block. In addition, to address the estimation shift caused by consecutive batch normalization (BN) layers, a new composite normalization method combining BN and layer normalization (BLN) is proposed for the GC-CS block. Experiments on two real HR SAR datasets show that the proposed network has fewer parameters and achieves higher classification accuracy.
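To make the "lighter" claim for the GC-CS block concrete, the sketch below illustrates the two generic operations the abstract names: group convolution, which divides the weight count of a convolution by the number of groups, and channel shuffle, as commonly defined (for example in ShuffleNet), which interleaves channels so that information can cross group boundaries. This is not the authors' implementation. The function names, the channel width of 96, the 3x3 kernel, and the group counts are all illustrative assumptions, not values taken from the letter.

```python
def group_conv_params(c_in, c_out, kernel, groups=1):
    """Weight count (no bias) of a 2-D convolution split into `groups` groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * kernel * kernel


def channel_shuffle(channels, groups):
    """Interleave channels so every output group mixes all input groups."""
    n = len(channels)
    if n % groups:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # Equivalent to: reshape to (groups, per_group), transpose, flatten.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical sizes, chosen only for illustration.
dense = group_conv_params(96, 96, 3)               # standard convolution
grouped = group_conv_params(96, 96, 3, groups=4)   # 4-group convolution
shuffled = channel_shuffle(list(range(8)), groups=2)

print(dense, grouped)   # 82944 20736
print(shuffled)         # [0, 4, 1, 5, 2, 6, 3, 7]
```

With these assumed sizes, the grouped layer carries one quarter of the weights of the dense one, and the shuffle places channels from both input groups side by side, so a following group convolution sees a mix of them.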