HR-UVFormer: A Top-Down and Multimodal Hierarchical Extraction Approach for Urban Villages
Xin Tan, Qingyan Meng, Fei Zhao, Linlin Zhang, Xinli Hu, Tamás Jancsó
DOI: https://doi.org/10.1109/tgrs.2024.3387022
IF: 8.2
2024-04-30
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Urban village (UV) renovation has been incorporated into the Sustainable Development Goals (SDGs) because inequality among residents has attracted substantial social attention. However, existing deep-learning techniques for UV extraction have been limited to a single spatial scale (e.g., patch-level or pixel-level extraction), which leads to inadequate precision and integrity in the extraction results. To overcome this limitation, our study introduces HR-UVFormer, a top-down and multimodal hierarchical extraction approach that extracts UVs from a coarse scale (patch) to a fine scale (pixel), aiming to improve the internal completeness and boundary accuracy of the extraction results. The multimodal approach fuses additional modal features [e.g., building footprints (BFs)] with remote sensing images (RSIs) to enhance UV extraction. Results for Shenzhen show that coarse-scale extraction achieves an overall accuracy (OA) of 98.79%, and fine-grained extraction achieves a mean Intersection over Union (mIoU) of 93.60%. Furthermore, ablation experiments demonstrate a 7.14% improvement in mIoU with the hierarchical extraction strategy compared to the traditional pixel-based extraction strategy, and the fusion of BF and RSI yields further improvements of 2.78% and 0.65% in OA and mIoU, respectively. This finding confirms the synergistic effect between RSI and BF in UV extraction, which is analyzed further in this study. In addition, the proposed model outperforms other deep learning models and shows the potential to support more modal features [e.g., points of interest (POI)]. Finally, the experimental dataset and code are publicly available at https://github.com/q1310546582/HR-UVFormer-code.
engineering, electrical & electronic; imaging science & photographic technology; remote sensing; geochemistry & geophysics
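The abstract reports overall accuracy (OA) and mean Intersection over Union (mIoU), and describes a top-down strategy in which coarse patch-level results guide fine pixel-level extraction. The sketch below is only a toy illustration of those two ideas, not the authors' model: it assumes binary labels (1 = UV, 0 = non-UV), and the function names (`gate_pixels_by_patches`, `confusion_matrix`, `overall_accuracy`, `mean_iou`) and the simple "keep pixel labels only inside UV patches" rule are hypothetical stand-ins for the transformer-based pipeline in the paper.

```python
from typing import List

Grid = List[List[int]]  # 2-D grid of class labels (assumed: 1 = UV, 0 = non-UV)


def gate_pixels_by_patches(pixel_pred: Grid, patch_pred: Grid, patch_size: int) -> Grid:
    """Keep a pixel-level UV label only where its enclosing patch was flagged as UV.

    This is a simplified, assumed form of coarse-to-fine refinement.
    """
    gated = []
    for r, row in enumerate(pixel_pred):
        gated_row = []
        for c, label in enumerate(row):
            patch_label = patch_pred[r // patch_size][c // patch_size]
            gated_row.append(label if patch_label == 1 else 0)
        gated.append(gated_row)
    return gated


def confusion_matrix(truth: Grid, pred: Grid, num_classes: int = 2) -> List[List[int]]:
    """Return cm where cm[t][p] counts pixels of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            cm[t][p] += 1
    return cm


def overall_accuracy(cm: List[List[int]]) -> float:
    """OA = correctly classified pixels / all pixels."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def mean_iou(cm: List[List[int]]) -> float:
    """mIoU = mean over classes of TP / (TP + FP + FN), skipping absent classes."""
    ious = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(len(cm))) - tp
        fn = sum(cm[k]) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy 4x4 scene split into 2x2 patches; only the top-left patch contains a UV.
truth = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
pixel_pred = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
patch_pred = [[1, 0], [0, 0]]  # coarse-scale decision per patch

gated_pred = gate_pixels_by_patches(pixel_pred, patch_pred, patch_size=2)
for name, pred in (("pixel-only", pixel_pred), ("patch-gated", gated_pred)):
    cm = confusion_matrix(truth, pred)
    print(f"{name}: OA={overall_accuracy(cm):.4f}, mIoU={mean_iou(cm):.4f}")
```

In this toy scene the patch-level decision removes the false positives in the bottom-right patch, so both metrics rise after gating. This only illustrates why a coarse-scale filter can help a pixel-level result; the gains reported in the abstract come from the authors' own experiments on the Shenzhen data.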