CLANET: a cross-linear attention network for semantic segmentation of urban scenes remote sensing images
Chao Chen, Yurong Qian, Hui Liu, Guangqi Yang
Affiliations: (a) School of Software, Xinjiang University, Urumqi, China; (b) Key Laboratory of Signal Detection and Processing in Xinjiang Uygur Autonomous Region, Urumqi, China; (c) Key Laboratory of Software Engineering, Xinjiang University, Urumqi, China; (d) College of Information Science and Engineering, Xinjiang University, Urumqi, China
DOI: https://doi.org/10.1080/01431161.2023.2284238
IF: 3.531
2023-11-28
International Journal of Remote Sensing
Abstract: Semantic segmentation of high-resolution remote sensing images is important for land cover classification, road extraction, building extraction, water extraction, and similar tasks. However, high-resolution remote sensing images contain a large amount of detail. Because convolution blocks have a fixed receptive field, they cannot model the correlation of global features. In addition, complex fusion methods cannot integrate spatial and global context information. To solve these problems, this paper proposes a cross-linear attention network (CLANet) that captures both spatial and context information in images. The network consists of a spatial branch and a context branch. The spatial branch is built from stacked convolutions to better capture spatial information. The context branch models global information based on the transformer deformation module. To fuse spatial and context information effectively, the paper also designs a feature fusion module (FFM), which uses a cross-linear attention mechanism for feature aggregation. Extensive experiments are conducted on the ISPRS Vaihingen and ISPRS Potsdam datasets. CLANet achieves an mIoU of 82.28% on the ISPRS Vaihingen dataset. The experimental results show that CLANet outperforms methods proposed in recent years.
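The abstract does not give the formulation of the cross-linear attention used in the FFM. As a rough illustration only, the following NumPy sketch shows one common way linear attention can be applied across two feature streams. Everything in it is an assumption rather than the authors' method: the `elu(x) + 1` feature map, the bidirectional residual fusion, and the names `feature_map`, `linear_cross_attention`, and `fuse` are all hypothetical. Both branches are assumed to be flattened to the same number of positions.

```python
import numpy as np

def feature_map(x):
    # Assumed kernel feature map elu(x) + 1, which keeps all entries positive.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_cross_attention(query_feats, source_feats):
    """Attend from query_feats (N, d) to source_feats (M, d); returns (N, d)."""
    q = feature_map(query_feats)       # (N, d)
    k = feature_map(source_feats)      # (M, d)
    v = source_feats                   # (M, d)
    kv = k.T @ v                       # (d, d): computed once, linear in M
    normalizer = q @ k.sum(axis=0)     # (N,): per-query sum of kernel scores
    return (q @ kv) / normalizer[:, None]

def fuse(spatial, context):
    # Hypothetical fusion: each branch attends to the other, with residuals.
    s = spatial + linear_cross_attention(spatial, context)
    c = context + linear_cross_attention(context, spatial)
    return s + c

rng = np.random.default_rng(0)
spatial = rng.normal(size=(6, 4))   # 6 positions, 4 channels
context = rng.normal(size=(6, 4))
fused = fuse(spatial, context)      # shape (6, 4)
```

Multiplying `k.T @ v` first avoids forming the N×M attention matrix, which is what keeps the cost linear in the number of positions; the result matches the explicit row-normalized kernel attention.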
imaging science & photographic technology,remote sensing