Disentangled Foreground-Semantic Adapter Network for Generalized Aerial Image Few-Shot Semantic Segmentation
Qixiong Wang, Jihao Yin, Hongxiang Jiang, Jiaqi Feng, Guangyun Zhang
DOI: https://doi.org/10.1109/tgrs.2024.3455427
IF: 8.2
2024-10-04
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Semantic segmentation of remote sensing imagery requires extensive annotated samples for training and struggles to adapt to novel classes with few annotations. Few-shot semantic segmentation (FS-Seg) employs a support-to-query paradigm, which encounters many practical constraints. Recently, generalized few-shot semantic segmentation (GFS-Seg) has been proposed to align with general semantic segmentation paradigms, enabling the segmentation of all classes (both base and novel) in the image. However, in aerial imagery, existing GFS-Seg methods struggle with large intra-class variance in the background, degradation on base classes, and overfitting on novel classes during fine-tuning. To address these issues, we propose the disentangled foreground-semantic adapter network (DFSA-Net) for generalized aerial image FS-Seg. Specifically, to reduce interference from background features, DFSA-Net employs a foreground-semantic decoder (FSD) that decomposes semantic segmentation into foreground aggregation and multiclass refinement. To mitigate base-class degradation and novel-class overfitting during fine-tuning, we propose a disentangled low-rank adapter (DLA) for the fine-tuning phase, designed to preserve the base parameters while ensuring efficient adaptation to novel classes. Finally, we introduce an inference ensemble strategy that merges the base and novel decoder predictions to produce the final output. Experimental results on the NWPU and iSAID datasets demonstrate the superiority of DFSA-Net over the compared methods.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
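
Illustrative sketch (not from the paper): the abstract describes two mechanisms, a low-rank adapter that keeps the base parameters fixed during fine-tuning and an inference ensemble that merges base and novel decoder predictions. The abstract does not specify how DLA is "disentangled" or how the ensemble combines scores, so the code below is only a generic LoRA-style adapter and a simple concatenate-then-argmax merge, both assumptions made for illustration. The names `LowRankAdapter`, `matvec`, and `ensemble_predict` are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LowRankAdapter:
    """Frozen weight W plus a low-rank update B @ A (generic LoRA-style sketch).

    Assumption: this is NOT the paper's DLA, only the standard low-rank idea
    of adapting to new classes without modifying the base parameters.
    """

    def __init__(self, weight, rank, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # base parameters, never updated during fine-tuning
        # A gets a small random init; B starts at zero so the adapter is a no-op.
        self.a = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.b = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x):
        base = matvec(self.weight, x)               # frozen path
        delta = matvec(self.b, matvec(self.a, x))   # trainable low-rank path
        return [p + q for p, q in zip(base, delta)]


def ensemble_predict(base_scores, novel_scores):
    """Assumed merge rule: concatenate per-class scores and take the argmax.

    Returned indices follow the order base classes first, then novel classes.
    """
    scores = list(base_scores) + list(novel_scores)
    return max(range(len(scores)), key=scores.__getitem__)
```

Because `b` is initialized to zero, the adapted layer reproduces the base layer's output exactly before any fine-tuning step, which is one common way to avoid disturbing base-class behavior at the start of adaptation.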