Weakly Supervised Building Extraction from High-Resolution Remote Sensing Images Based on Building-Aware Clustering and Activation Refinement Network

Daoyuan Zheng, Shaohua Wang, Haixia Feng, Shunli Wang, Mingyao Ai, Pengcheng Zhao, Jiayuan Li, Qingwu Hu
DOI: https://doi.org/10.1109/tgrs.2024.3438248
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Weakly supervised building extraction methods that use image-level labels offer a cost-effective solution by significantly reducing the need for pixel-level annotation in high-resolution (HR) remote sensing (RS) images. These methods often focus on class activation map (CAM) optimization based on features extracted from individual images, missing out on the benefits of associating building features across multiple RS images (i.e., n images) to improve CAMs. This limitation leaves room for improvement in both CAM optimization and pseudo-mask generation. To address this, we propose the building-aware clustering and activation refinement network (BAC-AR-Net), a novel network that enhances weakly supervised building extraction performance. The building-aware clustering (BAC) module aggregates and clusters feature maps from multiple building samples to obtain common features of buildings. These common features are then used to extract regions with similar building semantics, thereby enhancing the accuracy and completeness of building coverage in CAMs. Additionally, the activation refinement module is designed to generate pseudo-masks with clear boundaries and an effective separation of buildings and background. Experiments were conducted on the ISPRS Potsdam and Vaihingen datasets as well as a self-built building dataset to verify the effectiveness of the proposed method. The results show that the proposed method outperforms both weakly supervised semantic segmentation methods and weakly supervised building extraction methods that use image-level labels, achieving IoU scores of 0.8556, 0.8163, and 0.7797 on the three datasets, respectively. This study introduces a novel weakly supervised learning framework to RS applications, with a particular focus on building extraction and semantic segmentation tasks.
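The abstract does not give implementation details for the BAC module, so the following is only a rough illustration of the general idea it describes: pool features from several images, cluster the features under confident CAM activations into a few shared prototypes, and score every pixel by its similarity to those prototypes. Everything here is an assumption made for illustration, including the plain k-means routine, the cosine-similarity scoring, the hypothetical function names `kmeans` and `common_feature_similarity`, and the parameters `k` and `cam_thresh`. It is not the authors' method.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain k-means on the rows of x; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct samples (fancy indexing copies).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every sample to every centroid.
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def common_feature_similarity(feature_maps, cams, k=4, cam_thresh=0.5):
    """Score each pixel by similarity to prototypes shared across images.

    feature_maps: float array of shape (n, c, h, w), one map per image.
    cams:         float array of shape (n, h, w), values in [0, 1].
    Returns an (n, h, w) array of cosine similarities in [-1, 1].
    """
    n, c, h, w = feature_maps.shape
    # Flatten all images into one pool of per-pixel feature vectors.
    feats = feature_maps.transpose(0, 2, 3, 1).reshape(-1, c)
    # Assumption: pixels with high CAM activation are treated as building samples.
    confident = cams.reshape(-1) >= cam_thresh
    prototypes = kmeans(feats[confident], k)
    # Cosine similarity of every pixel to its best-matching prototype.
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + 1e-8)
    return (f @ p.T).max(axis=1).reshape(n, h, w)
```

In this toy setting, a CAM that covers only part of a building can still yield high similarity over the whole building, because the remaining building pixels resemble the pooled prototypes. This is one plausible reading of how cross-image common features could improve CAM completeness.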