Application of mask R-CNN for building detection in UAV remote sensing images

Tao Hou, Jing Li
DOI: https://doi.org/10.1016/j.heliyon.2024.e38141
IF: 3.776
2024-09-19
Heliyon
Abstract: This study addresses two weaknesses of traditional methods, low accuracy in building feature extraction and insufficient detail in three-dimensional (3D) modeling, particularly against complex backgrounds. To address these issues, a building feature extraction method based on the Mask Region-based Convolutional Neural Network (Mask R-CNN) is proposed. The approach combines deep learning with aerial images to achieve precise and efficient automatic detection and feature extraction. Urban building images are captured through aerial photography, and building outlines are annotated to create a comprehensive dataset of building features. The Mask R-CNN-based method processes and classifies the features of the dataset, generating candidate regions for further analysis. It also uses the Mask R-CNN model to generate adaptive features, which gives it significant advantages in building feature extraction. Comparative analysis with the Convolutional Neural Network (CNN), Region-based Convolutional Neural Network (R-CNN), Fast Region-based Convolutional Neural Network (Fast R-CNN), Faster Region-based Convolutional Neural Network (Faster R-CNN), and Generative Adversarial Network (GAN) indicates that Mask R-CNN exhibits superior performance in building feature extraction. The Mask R-CNN-based approach achieves approximately 95% classification accuracy and shows strong stability and generalization. This study provides new methods and insights for feature extraction in aerial building imagery and offers significant reference value for architectural design and urban planning.