BFA-YOLO: A balanced multiscale object detection network for building façade attachments detection

Yangguang Chen, Tong Wang, Guanzhou Chen, Kun Zhu, Xiaoliang Tan, Jiaqi Wang, Wenchao Guo, Qing Wang, Xiaolong Luo, Xiaodong Zhang
2024-11-11
Abstract: The detection of façade elements on buildings, such as doors, windows, balconies, air conditioning units, billboards, and glass curtain walls, is a critical step in automating the creation of Building Information Modeling (BIM). Yet, this field faces significant challenges, including the uneven distribution of façade elements, the presence of small objects, and substantial background noise, which hamper detection accuracy. To address these issues, we develop the BFA-YOLO model and the BFA-3D dataset in this study. The BFA-YOLO model is an advanced architecture designed specifically for analyzing multi-view images of façade attachments. It integrates three novel components: the Feature Balanced Spindle Module (FBSM), which tackles the issue of uneven object distribution; the Target Dynamic Alignment Task Detection Head (TDATH), which enhances the detection of small objects; and the Position Memory Enhanced Self-Attention Mechanism (PMESA), which aims to reduce the impact of background noise. These elements collectively enable BFA-YOLO to effectively address each challenge, thereby improving model robustness and detection precision. The BFA-3D dataset offers multi-view images with precise annotations across a wide range of façade attachment categories. This dataset is developed to address the limitations present in existing façade detection datasets, which often feature a single perspective and insufficient category coverage. Through comparative analysis, BFA-YOLO demonstrated improvements of 1.8\% and 2.9\% in mAP$_{50}$ on the BFA-3D dataset and the public Façade-WHU dataset, respectively, when compared to the baseline YOLOv8 model. These results highlight the superior performance of BFA-YOLO in façade element detection and the advancement of intelligent BIM technologies.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses three key problems in the detection of building façade attachments:

1. **Uneven category distribution**: Façade attachments (such as doors, windows, balconies, and air-conditioning units) are distributed very unevenly across images. Some categories have few instances, which makes it difficult for the model to learn their characteristics during training.
2. **Small-target detection**: Some attachments (such as small windows and air-conditioning units) are small and hard to identify accurately in large-scale images, which limits detection accuracy.
3. **Background noise interference**: The complex background of a building façade (wall textures, shadows, and so on) interferes with target detection and reduces detection accuracy.

To address these problems, the authors propose the BFA-YOLO model and the accompanying BFA-3D dataset:

- **BFA-3D dataset**: A multi-view, accurately labeled dataset covering many types of building façade attachments, built to overcome the single viewpoint and limited category coverage of existing datasets.
- **BFA-YOLO model**: Three new modules are introduced to improve detection performance:
  - **Feature Balanced Spindle Module (FBSM)**: Strengthens the detection of features from sparse categories through resampling techniques, improving the accuracy of category identification.
  - **Target Dynamic Alignment Task Detection Head (TDATH)**: Designed specifically to identify small targets accurately, improving the detection of small attachments.
  - **Position Memory Enhanced Self-Attention Mechanism (PMESA)**: Introduces position information to reduce interference from complex backgrounds and improve detection accuracy.

Together, these improvements help BFA-YOLO handle category imbalance, small-target detection, and background interference in the façade attachment detection task. Experimental results show that BFA-YOLO outperforms existing mainstream detection models on multiple evaluation metrics. The abstract reports mAP$_{50}$ gains of 1.8% and 2.9% over the YOLOv8 baseline on the BFA-3D and Façade-WHU datasets, respectively.
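
The mAP$_{50}$ metric quoted in the abstract counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5, then averages the per-class average precision (AP). The sketch below illustrates that general definition only. It is not the paper's evaluation code, and it makes simplifying assumptions: boxes are `(x1, y1, x2, y2)` tuples, all boxes of one class are pooled as if they came from a single image, each ground-truth box can be matched once, and AP is the all-point area under the precision envelope. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: List[Tuple[float, Box]], gts: List[Box], thr: float = 0.5) -> float:
    """AP for one class; preds are (confidence, box) pairs."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits = []
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if matched[j]:
                continue  # simplification: each ground truth matches once
            v = iou(box, gt)
            if v > best_iou:
                best_iou, best_j = v, j
        if best_j >= 0 and best_iou >= thr:
            matched[best_j] = True
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    # Precision and recall after each prediction.
    tp, recalls, precisions = 0, [], []
    for rank, h in enumerate(hits, start=1):
        tp += h
        recalls.append(tp / len(gts))
        precisions.append(tp / rank)
    # Make precision non-increasing (the precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum precision over each recall increment.
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap


def map50(per_class: Dict[str, Tuple[List[Tuple[float, Box]], List[Box]]]) -> float:
    """Mean of the per-class AP values at IoU threshold 0.5."""
    aps = [ap50(preds, gts) for preds, gts in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

Because the result is an unweighted mean over classes, a rare category such as billboards weighs as much as a common one such as windows, which is why class imbalance shows up directly in this metric.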