End-to-end multiview fusion for building mapping from aerial images

Qi Chen, Wenxiang Gan, Pengjie Tao, Penglei Zhang, Rongyong Huang, Lei Wang
DOI: https://doi.org/10.1016/j.inffus.2024.102498
IF: 18.6
2024-06-02
Information Fusion
Abstract: In the domain of photogrammetry, the fusion of information from multiple views holds the potential to significantly enhance the accuracy and robustness of building mapping. While multiview observation and stereoscopic imaging form the bedrock of photogrammetric projects, current deep learning methodologies predominantly focus on orthophotos and digital surface models (DSMs), often sidelining the rich multiview information inherent in original images. Addressing this gap, we present Multiview Mapper (MVMapper), an end-to-end learning framework explicitly crafted to harness and fuse the rich semantic information from original multiview images with object-space features. MVMapper uses stereo labels for supervised building segmentation in a dual space encompassing both the image and object domains. Additionally, it incorporates a novel piecewise affine projection method to ensure a robust image-to-object feature transformation. Experimental results on an aerial photogrammetric dataset with a resolution of 30 cm demonstrate MVMapper's superiority over state-of-the-art multiview data fusion methods, yielding significant improvements in segmentation and contour accuracy. Notably, the proposed piecewise affine projection method mitigates misalignment issues caused by DSM noise, enabling the effective fusion of multiview features with object-space features. Further experimentation on a separate open-source dataset demonstrates MVMapper's substantial advantages in transferability to other contexts.
computer science, artificial intelligence, theory & methods