Remote sensing building damage assessment with a multihead neighbourhood attention transformer

Chen Yu, Bin Hu, Xiuchuan Cheng, Guangqiang Yin, Zhiguo Wang
DOI: https://doi.org/10.1080/01431161.2023.2242590
IF: 3.531
2023-08-13
International Journal of Remote Sensing
Abstract: Most existing remote sensing disaster assessment methods rely on convolutional neural networks (CNNs). Although CNNs can extract effective semantic features, their ability to capture global spatial relationships remains limited by the locality of convolutional operations. Recently developed transformer-based methods can extract global information from images effectively by encoding image tokens. However, their consumption of computational resources increases markedly with image resolution. In this paper, we propose a novel transformer-based neural network for disaster assessment problems. The network uses a multihead neighbourhood attention (MNA) transformer as the base layer of the encoder to achieve more efficient self-attention computation. In addition, the bitemporal feature fusion module (BFFM) performs differential enhancement and injects the change information into the decoder via skip connections. The multiscale tokenizer generates multiscale image tokens to mitigate the loss of detail during encoding. Experimental results on three datasets show that the proposed method outperforms existing methods.
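To illustrate why neighbourhood attention is cheaper than global self-attention, here is a minimal NumPy sketch in which each pixel attends only to a kernel x kernel window around it. It is not the authors' implementation: the function name, the projection matrices wq, wk and wv, the head count, and the choice to shift windows inward at the borders are all assumptions made for illustration, and the sketch omits positional biases and the multiscale tokenizer.

```python
import numpy as np

def neighbourhood_attention(x, wq, wk, wv, num_heads=2, kernel=3):
    """Multihead attention restricted to a kernel x kernel neighbourhood.

    x: (H, W, C) feature map; wq, wk, wv: (C, C) projections (hypothetical).
    """
    H, W, C = x.shape
    assert C % num_heads == 0 and kernel % 2 == 1
    assert kernel <= H and kernel <= W
    d = C // num_heads
    # Project, then split channels into heads: (H, W, heads, d).
    q = (x @ wq).reshape(H, W, num_heads, d)
    k = (x @ wk).reshape(H, W, num_heads, d)
    v = (x @ wv).reshape(H, W, num_heads, d)
    out = np.empty_like(q)
    r = kernel // 2
    for i in range(H):
        # Assumption: shift the window inward at borders so every pixel
        # always sees exactly kernel * kernel neighbours.
        i0 = min(max(i - r, 0), H - kernel)
        for j in range(W):
            j0 = min(max(j - r, 0), W - kernel)
            kn = k[i0:i0 + kernel, j0:j0 + kernel].reshape(-1, num_heads, d)
            vn = v[i0:i0 + kernel, j0:j0 + kernel].reshape(-1, num_heads, d)
            # Scaled dot-product scores per head: (heads, kernel * kernel).
            scores = np.einsum("hd,nhd->hn", q[i, j], kn) / np.sqrt(d)
            scores -= scores.max(axis=1, keepdims=True)  # numerical stability
            weights = np.exp(scores)
            weights /= weights.sum(axis=1, keepdims=True)
            out[i, j] = np.einsum("hn,nhd->hd", weights, vn)
    return out.reshape(H, W, C)
```

In this sketch each pixel compares against kernel * kernel keys rather than all H * W, so the attention cost grows linearly with the number of pixels instead of quadratically.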
imaging science & photographic technology, remote sensing