BAFNet: Bilateral Attention Fusion Network for Lightweight Semantic Segmentation of Urban Remote Sensing Images

Wentao Wang,Xili Wang
2024-09-16
Abstract:Large-scale semantic segmentation networks often achieve high performance, while their application can be challenging when faced with limited sample sizes and computational resources. In scenarios with restricted network size and computational complexity, models encounter significant challenges in capturing long-range dependencies and recovering detailed information in images. We propose a lightweight bilateral semantic segmentation network called bilateral attention fusion network (BAFNet) to efficiently segment high-resolution urban remote sensing images. The model consists of two paths, namely dependency path and remote-local path. The dependency path utilizes large kernel attention to acquire long-range dependencies in the image. Besides, multi-scale local attention and efficient remote attention are designed to construct remote-local path. Finally, a feature aggregation module is designed to effectively utilize the different features of the two paths. Our proposed method was tested on public high-resolution urban remote sensing datasets Vaihingen and Potsdam, with mIoU reaching 83.20% and 86.53%, respectively. As a lightweight semantic segmentation model, BAFNet not only outperforms advanced lightweight models in accuracy but also demonstrates comparable performance to non-lightweight state-of-the-art methods on two datasets, despite a tenfold variance in floating-point operations and a fifteenfold difference in network parameters.
Computer Vision and Pattern Recognition,Machine Learning
What problem does this paper attempt to address?
The paper aims to address the challenges of applying large-scale semantic segmentation networks in scenarios with limited sample sizes and constrained computational resources. Specifically, in situations where network size and computational complexity are restricted, the model faces significant challenges in capturing long-range dependencies in images and recovering detailed information. To this end, the authors propose a lightweight bidirectional semantic segmentation network—Bidirectional Attention Fusion Network (BAFNet)—for efficiently segmenting high-resolution urban remote sensing images. BAFNet includes two paths: the dependency path and the remote-local path. The dependency path utilizes large kernel attention to capture long-range dependencies in images, while the remote-local path is constructed through multi-scale local attention and an efficient remote attention module. Finally, a feature aggregation module is designed to effectively utilize the different features from the two paths. Experimental results show that BAFNet achieves mIoU of 83.20% and 86.53% on the public high-resolution urban remote sensing datasets Vaihingen and Potsdam, respectively. It not only outperforms advanced lightweight models but also performs comparably to state-of-the-art non-lightweight methods, despite having ten times fewer floating-point operations and fifteen times fewer network parameters.