SNNFD, Spiking Neural Segmentation Network in Frequency Domain Using High Spatial Resolution Images for Building Extraction.

Bo Yu, Aqiang Yang, Fang Chen, Ning Wang, Lei Wang
DOI: https://doi.org/10.1016/j.jag.2022.102930
IF: 7.5
2022-01-01
International Journal of Applied Earth Observation and Geoinformation
Abstract: Up-to-date building maps are fundamental to urban development and analysis. However, detecting buildings in images with different spatial resolutions and ground object patterns from various imaging sensors is a challenge. Most current models extract buildings with poorly delineated boundaries because building appearances and sizes vary widely. Moreover, most published methods are trained and evaluated on subsets of the same dataset, whose images are captured by one imaging sensor and show similar ground object patterns, making it difficult to evaluate transferability objectively. To address these issues, a spiking neural network in the frequency domain (SNNFD) is proposed to enhance model transferability and the feature representation of buildings of different sizes by synthesizing frequency-domain and spatial-domain learning. Spiking convolution is adopted in the frequency learning module to enhance the model's learning ability by mimicking the learning process of the human brain. The learned frequency features are concatenated, transformed to the spatial domain, and used by convolutional networks to generate the building-extraction result images. SNNFD is evaluated on two datasets with different spatial resolutions (0.3–2.5 m) from different imaging sensors (Quickbird, Worldview, IKONOS, ZY-3) covering study areas worldwide. It is compared with five recently proposed semantic segmentation frameworks (Unet, Segnet, DeepLabv3, BiSeNet, F3-Net) and obtains at least 6.33% higher accuracy, with strong transferability in detecting building instances of different sizes. Specifically, the proposed model improves the segmentation performance on small building instances by at least 6.5% compared with the five segmentation frameworks by incorporating spiking convolution into frequency-domain learning. Moreover, details of building boundaries are better maintained by SNNFD, offering the possibility of detecting buildings in practical applications.
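
The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea it describes: move an image into the frequency domain, pass the frequency features through a spiking (leaky integrate-and-fire) unit, and transform the result back to the spatial domain. It is not the SNNFD architecture. The function names `lif_spikes` and `frequency_spike_filter`, the parameters `steps`, `threshold` and `decay`, and the choice to gate the spectrum by spike rate are all assumptions made for illustration; the paper uses learned spiking convolutions, which this sketch does not reproduce.

```python
import numpy as np


def lif_spikes(current, steps=4, threshold=0.5, decay=0.5):
    """Leaky integrate-and-fire unit (hypothetical helper).

    Feeds the same input `current` for `steps` time steps and returns
    the fraction of steps on which each element emitted a spike.
    """
    current = np.asarray(current, dtype=float)
    membrane = np.zeros_like(current)
    spike_count = np.zeros_like(current)
    for _ in range(steps):
        membrane = decay * membrane + current      # leak, then integrate input
        fired = membrane >= threshold              # binary spike decision
        spike_count += fired
        membrane = np.where(fired, 0.0, membrane)  # hard reset after a spike
    return spike_count / steps


def frequency_spike_filter(image, steps=4, threshold=0.5, decay=0.5):
    """Gate an image's spectrum by LIF spike rates (illustrative assumption)."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)                   # spatial -> frequency domain
    magnitude = np.abs(spectrum)
    drive = magnitude / (magnitude.max() + 1e-12)   # normalise to roughly [0, 1]
    gate = lif_spikes(drive, steps, threshold, decay)
    filtered = np.fft.ifft2(spectrum * gate)        # frequency -> spatial domain
    return filtered.real                            # imaginary part is round-off


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.random((8, 8))                      # stand-in for an image patch
    out = frequency_spike_filter(patch)
    print(out.shape)
```

In this sketch, frequency components with a strong response spike often and are kept, while weak components rarely or never spike and are suppressed. A trainable model would replace the fixed gate with learned spiking convolution layers and a surrogate gradient for the non-differentiable spike.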