MF-BHNet: A Hybrid Multimodal Fusion Network for Building Height Estimation Using Sentinel-1 and Sentinel-2 Imagery

Siyuan Wang, Bowen Cai, Dongyang Hou, Qing Ding, Jiaming Wang, Zhenfeng Shao
DOI: https://doi.org/10.1109/tgrs.2024.3477588
IF: 8.2
2024-10-25
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Integrated Sentinel-1 synthetic aperture radar (SAR) imagery and Sentinel-2 optical imagery have shown great promise in mapping large-scale building height. Effectively fusing the complementary features of SAR and optical imagery is a key challenge in enhancing building height estimation performance. However, SAR and optical imagery are highly heterogeneous, which makes accurate building height estimation difficult. In this article, we propose a hybrid multimodal fusion network (MF-BHNet) for building height estimation using Sentinel-1 SAR imagery and Sentinel-2 optical imagery. First, we design a hybrid multimodal encoder to mine modal-specific features and model intermodal correlation. In particular, an intramodal encoder (IME) is designed to reconstruct valuable intramodal information, and a transformer-based cross-modal encoder (CME) is used to model intermodal correlation and capture contextual information. Then, a coarse-fine progressive multimodal fusion method is proposed to fuse SAR and optical features to improve building height estimation performance. To validate our method, we construct a building height dataset by introducing superior building footprints. Experimental results demonstrate that MF-BHNet outperforms 11 state-of-the-art methods, achieving the lowest root-mean-square error (RMSE) of 3.6421 m. In addition, compared with four publicly available building height products, the mapping result of the proposed method has significant advantages in spatial detail and accuracy.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
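The abstract reports accuracy as a root-mean-square error (RMSE) in metres. As a minimal sketch of how that metric is computed, assuming predicted and reference heights are available as equal-length sequences, the following may help. The function name `rmse` and the sample values are hypothetical illustrations and are not taken from the paper or its dataset.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences of heights (m)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")
    # Square each per-sample error, average, then take the square root.
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical building heights in metres, for illustration only.
predicted_heights = [12.0, 9.0, 30.0, 6.0]
reference_heights = [10.0, 9.0, 26.0, 8.0]
error = rmse(predicted_heights, reference_heights)
print(f"RMSE: {error:.4f} m")
```

Because the errors are squared before averaging, a few large misestimates (for example on tall buildings) raise the RMSE more than many small ones, which is worth keeping in mind when comparing reported values across methods.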