CSA-UNet: Channel-Spatial Attention-Based Encoder–Decoder Network for Rural Blue-Roofed Building Extraction from UAV Imagery

Xu Shi, Hong Huang, Chunyu Pu, Yinming Yang, Jie Xue
DOI: https://doi.org/10.1109/lgrs.2022.3197319
IF: 5.343
2022-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: Building extraction is a critical part of remote-sensing (RS) image interpretation and a popular research topic in the RS community. However, extracting buildings from RS images is difficult because buildings vary in shape and size and appear in complex scenes. The features extracted by existing deep learning methods lack discrimination, resulting in incomplete buildings and irregular boundaries. Most studies concentrate on urban areas and ignore the extraction of illegal blue-roofed buildings in rural areas. To address these problems, a channel-spatial attention-based encoder-decoder network (CSA-UNet) is proposed for rural blue-roofed building extraction from RS images. To focus on the key areas of buildings, CSA-UNet applies channel-spatial attention to the fused encoder and decoder features to obtain discriminative and attentive features. In addition, to reduce false-negative predictions, a joint loss function that assigns a weight to positive samples is designed and used to optimize the CSA-UNet model. Because blue-roofed buildings are a special type of illegal building, they are taken as the example for this study, and a blue-roof dataset termed UAVBlue is built from imagery acquired by unmanned aerial vehicles (UAVs). Experimental results show that CSA-UNet outperforms several state-of-the-art (SOTA) methods.
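The idea of weighting positive samples in a joint loss can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the loss is taken to be a positively weighted binary cross-entropy combined with a Dice term, and the names `pos_weight` and `alpha` and their default values are hypothetical. It works on flat lists of per-pixel probabilities and labels, using only the Python standard library.

```python
import math

def weighted_bce(probs, labels, pos_weight=2.0, eps=1e-7):
    """Binary cross-entropy with building (positive) pixels scaled by pos_weight.

    pos_weight is a hypothetical parameter; the paper's actual weight is not
    stated in the abstract.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss; its use as the second term is an assumption."""
    inter = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def joint_loss(probs, labels, pos_weight=2.0, alpha=0.5):
    """Convex combination of the two terms; alpha is a hypothetical mixing factor."""
    bce = weighted_bce(probs, labels, pos_weight)
    return alpha * bce + (1.0 - alpha) * dice_loss(probs, labels)

# Four pixels: the first two are building, the last two are background.
labels = [1, 1, 0, 0]
missed_building = [0.1, 0.9, 0.1, 0.1]   # one false negative
false_alarm = [0.9, 0.9, 0.9, 0.1]       # one false positive
```

With `pos_weight` above 1, `joint_loss(missed_building, labels)` is larger than `joint_loss(false_alarm, labels)`, so a missed building pixel costs more than a spurious one of the same confidence. That asymmetry is what pushes a model toward fewer false negatives.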