Abstract: As a fundamental task for geographical information updating, 3D city modeling, and other critical applications, the automatic extraction of building footprints from high-resolution remote sensing images has been substantially explored and has received increasing attention in recent years. Among the different types of building extraction methods, polygonal segmentation methods produce vector building polygons, a more realistic format than the outputs of pixel-wise semantic labeling and contour-based methods. However, existing polygonal building segmentation methods usually either require a perfect segmentation map and a complex post-processing procedure to guarantee polygonization quality, or produce inaccurate vertex predictions that suffer from wrong vertex order, self-intersections, a fixed number of vertices, and similar problems. In our previous work, we proposed a method for polygonal building segmentation from remote sensing images that addresses these limitations. In this paper, we propose PolyCity, which extends and improves that work in terms of application scenario, methodology design, and experimental results. PolyCity contains three components: (1) a pixel-wise multi-task network that learns semantic and geometric information via three tasks, i.e., building segmentation, boundary prediction, and edge orientation prediction; (2) a simple but effective vertex selection module (VSM), which bridges the gap between pixel-wise and graph-based models by transforming the segmentation map into valid polygon vertices; and (3) a graph-based vertex refinement network (VRN) that automatically adjusts the coordinates of the VSM-generated vertices, producing final building polygons with more precise vertices.
Results on three large-scale building extraction datasets demonstrate that PolyCity generates vector building footprints with more accurate vertices, edges, and shapes, achieving significant vertex score improvements while maintaining high segmentation and boundary scores compared with current state-of-the-art methods. The code of PolyCity will be released at https://github.com/liweijia/polycity.
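
To make the idea of turning a dense pixel boundary into a small set of polygon vertices more concrete, the following is a minimal sketch of one possible selection rule. It is not the VSM described in the abstract: the turning-angle criterion, the function names `select_vertices` and `square_contour`, and the parameter `angle_tol_deg` are all assumptions introduced here for illustration, and the boundary is assumed to be already traced as an ordered, closed list of points.

```python
import math


def select_vertices(contour, angle_tol_deg=10.0):
    """Keep points of a closed, ordered contour where the boundary turns.

    Hypothetical stand-in for a vertex selection step (NOT the paper's VSM):
    a point is kept when the direction of the boundary changes by more than
    angle_tol_deg degrees at that point.
    """
    n = len(contour)
    kept = []
    for i in range(n):
        px, py = contour[i - 1]          # previous point (wraps around)
        cx, cy = contour[i]              # current point
        nx, ny = contour[(i + 1) % n]    # next point (wraps around)
        # Directions of the incoming and outgoing boundary segments.
        a_in = math.atan2(cy - py, cx - px)
        a_out = math.atan2(ny - cy, nx - cx)
        # Absolute turning angle, wrapped into [0, 180] degrees.
        turn = abs(math.degrees(a_out - a_in))
        turn = min(turn, 360.0 - turn)
        if turn > angle_tol_deg:
            kept.append((cx, cy))
    return kept


def square_contour(side):
    """Dense boundary of an axis-aligned square, one point per unit step."""
    pts = [(x, 0) for x in range(side)]
    pts += [(side, y) for y in range(side)]
    pts += [(x, side) for x in range(side, 0, -1)]
    pts += [(0, y) for y in range(side, 0, -1)]
    return pts


dense = square_contour(4)          # 16 boundary points
corners = select_vertices(dense)   # only the points where the boundary turns
print(corners)                     # → [(0, 0), (4, 0), (4, 4), (0, 4)]
```

Because the number of kept points depends on the shape of the boundary, a rule of this kind does not fix the vertex count in advance. A learned refinement stage such as the VRN would then adjust the coordinates of the selected vertices, which this sketch does not attempt.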
