Coarse-to-fine Semantic Segmentation of Satellite Images
Hao Chen, Wen Yang, Li Liu, Gui-Song Xia
DOI: https://doi.org/10.1016/j.isprsjprs.2024.07.028
IF: 12.7
2024-01-01
ISPRS Journal of Photogrammetry and Remote Sensing
Abstract: Training deep neural networks for semantic segmentation of aerial images relies heavily on a large number of precise pixel-level annotations, which are expensive to produce. Because fine-class annotations are considerably harder to acquire than coarse-class annotations, we present a novel semi-supervised learning framework that trains on high-spatial-resolution images annotated with coarse-class labels together with a very small set of images carrying fine-grained annotations, yielding classification results that are refined in both spatial resolution and categorical granularity. Specifically, the framework adopts Mix Transformer (MiT) as the backbone to combine local feature extraction with long-range dependency modeling, and uses multi-prototype learning to model each class as multiple sub-prototypes, preserving intra-class variance. We propose a dedicated co-training approach for extracting fine-grained pseudo-labels from coarse-grained samples. In this approach, a local-softmax pseudo-labeling strategy is developed to balance the efficiency and accuracy of pseudo-labeling, and four losses are formulated for supervised learning both within a single class level and across category granularities. We evaluate the proposed framework on the Gaofen Image Dataset (GID) and the Five-Billion-Pixels (FBP) dataset, confirming its feasibility and superior results. In particular, with coarse-class annotations plus only 5% of the fine-class labels, the framework reaches 91%, 96%, 89%, and 93% of the fully supervised baseline performance in mIoU, mean UA, mean F1-score, and OA, respectively. The code is available at https://github.com/chenhaocs/C2F.
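The abstract names a local-softmax pseudo-labeling strategy but does not define it. One plausible reading, offered here only as an illustrative sketch and not as the authors' implementation, is that the softmax for a coarsely labeled pixel is computed only over the fine classes belonging to that pixel's coarse class, and a fine pseudo-label is kept only when it is confident enough. The class hierarchy, the function names, and the 0.6 confidence threshold below are all assumptions made for illustration.

```python
import math

# Hypothetical coarse-to-fine hierarchy (illustrative names only,
# not the actual GID/FBP taxonomy).
COARSE_TO_FINE = {
    "built-up": ["urban residential", "rural residential", "industrial"],
    "farmland": ["paddy field", "irrigated land", "dry cropland"],
}
# Flat list of fine classes; logits are assumed to follow this order.
FINE_CLASSES = [f for fines in COARSE_TO_FINE.values() for f in fines]


def local_softmax(logits, coarse_label):
    """Softmax restricted to the fine classes under `coarse_label`.

    `logits` holds one score per entry of FINE_CLASSES for a single pixel.
    Returns a dict mapping each child fine class to its probability.
    """
    children = COARSE_TO_FINE[coarse_label]
    idx = [FINE_CLASSES.index(c) for c in children]
    peak = max(logits[i] for i in idx)  # subtract the max for stability
    exps = [math.exp(logits[i] - peak) for i in idx]
    total = sum(exps)
    return {c: e / total for c, e in zip(children, exps)}


def pseudo_label(logits, coarse_label, threshold=0.6):
    """Return a fine pseudo-label, or None if the pixel is too uncertain.

    The threshold is an assumed hyperparameter, not taken from the paper.
    """
    probs = local_softmax(logits, coarse_label)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else None


if __name__ == "__main__":
    pixel_logits = [0.1, 0.2, 0.0, 5.0, 1.0, 0.5]
    print(pseudo_label(pixel_logits, "farmland"))
    print(pseudo_label(pixel_logits, "built-up"))
```

Under this reading, the coarse label acts as a hard constraint: a pixel annotated as one coarse class can never receive a fine pseudo-label from another coarse class, even if that fine class has the highest score overall, and pixels whose restricted distribution is too flat are left unlabeled.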