AAFormer: Attention-Attended Transformer for Semantic Segmentation of Remote Sensing Images

Xin Li, Feng Xu, Linyang Li, Nan Xu, Fan Liu, Chi Yuan, Ziqi Chen, Xin Lyu
DOI: https://doi.org/10.1109/lgrs.2024.3397851
IF: 5.343
2024-05-18
IEEE Geoscience and Remote Sensing Letters
Abstract: Rapid advances in remote sensing technology have made fine-resolution remote sensing images (RSIs) widely available, offering rich spatial detail and semantics. Transformers are applicable and scalable for semantic segmentation of RSIs because they learn pairwise contextual affinity, but in doing so they inevitably introduce irrelevant context, which hinders accurate inference of patch semantics. To address this, we propose a novel multihead attention-attended module (AAM) that refines the multihead self-attention mechanism (AM). The AAM filters out irrelevant context while highlighting informative context by considering the relevance between the self-attention maps and the query vector. It generates an attention gate that complements the contextual affinity and simultaneously assigns higher weights to useful context. Using the multihead AAM as the core unit, we construct a lightweight attention-attended transformer block (ATB). We then devise AAFormer, a pure transformer with a mask transformer decoder, for semantic segmentation of RSIs. We extensively evaluate our approach on the ISPRS Potsdam and LoveDA datasets, demonstrating compelling performance compared with mainstream methods. Additionally, we conduct evaluations to analyze the effects of the AAM.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
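
The abstract describes the AAM only at a high level: an attention gate, derived from the relevance between the self-attention result and the query, re-weights the attended context. The following is a minimal single-head NumPy sketch of that general idea, not the paper's formulation. The function name `gated_self_attention`, the weight matrix `w_gate`, and the choice to compute the gate from the concatenated query and attended context are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_self_attention(x, w_q, w_k, w_v, w_gate):
    """Single-head self-attention followed by a query-conditioned gate.

    x has shape (n_patches, dim). All weight matrices are hypothetical
    stand-ins; the gate design is an assumption, not the published AAM.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    # Standard scaled dot-product attention: pairwise contextual affinity.
    affinity = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (n, n)
    context = affinity @ v                              # (n, dim)

    # Assumed gate: score how relevant the attended context is to each
    # query, then squash the score to (0, 1) per channel.
    gate = sigmoid(np.concatenate([q, context], axis=-1) @ w_gate)

    # Irrelevant context is suppressed (gate near 0), useful context
    # passes through with a higher weight (gate near 1).
    return gate * context, affinity, gate


rng = np.random.default_rng(0)
n_patches, dim = 6, 8
x = rng.normal(size=(n_patches, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))
w_gate = rng.normal(size=(2 * dim, dim)) * 0.1

out, affinity, gate = gated_self_attention(x, w_q, w_k, w_v, w_gate)
print(out.shape, affinity.shape, gate.shape)
```

In this sketch the gate acts per patch and per channel, so it can damp attended features without altering the affinity map itself. A multihead variant would apply the same computation independently to each head before concatenating the head outputs.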