Abstract: Autonomous vehicles clearly benefit from the expanded Field of View (FoV) of 360-degree sensors, but modern semantic segmentation approaches rely heavily on annotated training data, which is rarely available for panoramic images. We look at this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting where labelled training data originates from a different distribution of conventional pinhole camera images. To achieve this, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation and collect DensePASS, a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, specifically built to study the Pinhole-to-Panoramic domain shift and accompanied by pinhole camera training examples obtained from Cityscapes. DensePASS covers both labelled and unlabelled 360-degree images, with the labelled data comprising 19 classes which explicitly fit the categories available in the source (i.e., pinhole) domain. Since data-driven models are especially susceptible to changes in data distribution, we introduce P2PDA, a generic framework for Pinhole-to-Panoramic semantic segmentation which addresses the challenge of domain divergence with different variants of attention-augmented domain adaptation modules, enabling transfer in the output, feature, and feature confidence spaces. P2PDA intertwines uncertainty-aware adaptation using confidence values regulated on the fly through attention heads with discrepant predictions. Our framework facilitates context exchange when learning domain correspondences and dramatically improves the adaptation performance of accuracy- and efficiency-focused models. Comprehensive experiments verify that our framework clearly surpasses unsupervised domain adaptation and specialized panoramic segmentation approaches.

Laformer: Vision Transformer for Panoramic Image Semantic Segmentation

PASS: Panoramic Annular Semantic Segmentation

Can We PASS Beyond the Field of View? Panoramic Annular Semantic Segmentation for Real-World Surrounding Perception

DS-PASS: Detail-Sensitive Panoramic Annular Semantic Segmentation Through SwaftNet for Surrounding Sensing

Omnisupervised Omnidirectional Semantic Segmentation

Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos

Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation

Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

Multi-source Domain Adaptation for Panoramic Semantic Segmentation

SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

PanoFormer: Panorama Transformer for Indoor 360 Depth Estimation

Deformable Mamba for Wide Field of View Segmentation

Single Frame Semantic Segmentation Using Multi-Modal Spherical Images

DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange

Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation

Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers

Semantic Segmentation of Panoramic Images Using a Synthetic Dataset

Panoramic image semantic segmentation using channel attention-based HarDNet and distorted boundary learning

Complementary Bi-directional Feature Compression for Indoor 360° Semantic Segmentation with Self-distillation

Look at the Neighbor: Distortion-aware Unsupervised Domain Adaptation for Panoramic Semantic Segmentation

Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation