High Resolution SAR Image Classification Using Global-Local Network Structure Based on Vision Transformer and CNN

Xingyu Liu, Yan Wu, Wenkai Liang, Yice Cao, Ming Li
DOI: https://doi.org/10.1109/lgrs.2022.3151353
IF: 5.343
2022-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: High-resolution (HR) synthetic aperture radar (SAR) image classification is a challenging task because of complex semantic scenes and coherent speckle. Convolutional neural networks (CNNs) have proven to have a superior capability for representing local spatial features of SAR images. However, it is hard to capture global image information with convolutions alone. To address this issue, this letter proposes an end-to-end network named global–local network structure (GLNS) for HR SAR classification. In the GLNS framework, a lightweight CNN and a compact vision transformer (ViT) are designed to learn local and global features, respectively, and the two types of features are fused through the fusion net to mine complementary information. Then, a twofold loss function is developed to reduce the interclass distance of SAR images, which brings more compactness to the classification features and less interference from coherent speckle. Experimental results on real HR SAR images indicate that the proposed method has stronger feature extraction capability and noise resistance. The method achieves the highest classification accuracy on both datasets compared with other related CNN-based approaches.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
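
The abstract describes two ideas that a toy example can make concrete: fusing local (CNN) and global (ViT) features, and a two-part loss that encourages compact classification features. The sketch below is only an illustration under stated assumptions and is not the authors' implementation. The abstract does not specify the fusion net or the twofold loss, so fusion is assumed here to be plain concatenation and the second loss term is assumed to be a centre-style squared distance. All names (`fuse`, `twofold_loss`, `weight`, `centers`) and all numbers are hypothetical.

```python
import math

def fuse(local_feat, global_feat):
    """Fuse two feature vectors by concatenation (an assumption, not the letter's fusion net)."""
    return list(local_feat) + list(global_feat)

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Standard classification loss for a single sample."""
    return -math.log(softmax(logits)[label])

def compactness(feature, center):
    """Squared Euclidean distance between a feature and a class centre (assumed second term)."""
    return sum((f - c) ** 2 for f, c in zip(feature, center))

def twofold_loss(logits, label, feature, centers, weight=0.1):
    """Classification term plus a weighted compactness term (hypothetical formulation)."""
    return cross_entropy(logits, label) + weight * compactness(feature, centers[label])

# Toy inputs: stand-ins for the outputs of a CNN branch and a ViT branch.
local_feat = [0.2, 0.8]
global_feat = [0.5, 0.1]
fused = fuse(local_feat, global_feat)

# Hypothetical class centres in the fused feature space and classifier scores.
centers = {0: [0.0, 1.0, 0.5, 0.0], 1: [1.0, 0.0, 0.0, 1.0]}
logits = [2.0, 0.5]
loss = twofold_loss(logits, 0, fused, centers)
```

With `weight=0` the loss reduces to ordinary cross-entropy, so the second term can be read as an optional regulariser that pulls each fused feature toward the centre of its own class.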