Deep Spherical Superpixels

Rémi Giraud, Michaël Clément
2024-07-24
Abstract: Over the years, the use of superpixel segmentation has become very popular in various applications, serving as a preprocessing step that reduces data size by adapting to the content of the image, regardless of its semantic content. While the superpixel segmentation of standard planar images, captured with a 90° field of view, has been extensively studied, there has been limited focus on methods dedicated to omnidirectional or spherical images, captured with a 360° field of view. In this study, we introduce the first deep learning-based superpixel segmentation approach tailored for omnidirectional images, called DSS (for Deep Spherical Superpixels). Our methodology leverages spherical CNN architectures and the differentiable K-means clustering paradigm for superpixels to generate superpixels that follow the spherical geometry. Additionally, we propose to use data augmentation techniques specifically designed for 360° images, enabling our model to efficiently learn from a limited set of annotated omnidirectional data. Our extensive validation across two datasets demonstrates that accounting for the inherent circular geometry of such images in our framework improves the segmentation performance over traditional and deep learning-based superpixel methods. Our code is available online.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the problem of superpixel segmentation for omnidirectional (or spherical) images. Specifically, the authors focus on how to use deep learning to generate superpixels that conform to spherical geometry, thereby improving the processing of 360° panoramic images.

#### Background

- **Standard Planar Images**: Existing superpixel segmentation methods are primarily designed for standard planar images (usually with a 90° field of view), and these methods have been extensively studied.
- **Omnidirectional Images**: In contrast, there is less research on superpixel segmentation methods specifically for omnidirectional images (with a 360° field of view). Because of their spherical geometry, omnidirectional images exhibit distortions when projected onto a 2D plane, making traditional planar superpixel methods less effective for these types of images.

#### Problems

- **Limitations of Existing Methods**: Traditional superpixel segmentation methods cannot effectively handle the geometric distortions in omnidirectional images, leading to decreased segmentation performance.
- **Data Scarcity**: Datasets of omnidirectional images are relatively scarce, and the annotation process is complex, limiting the application of deep learning methods.
- **Adaptability of Deep Learning Methods**: Existing deep learning superpixel segmentation methods are mainly designed for standard planar images and have not been optimized for omnidirectional images.

### Solution

- **Proposed Method**: The authors propose a deep learning method called Deep Spherical Superpixels (DSS), designed specifically for superpixel segmentation of omnidirectional images.
- **Technical Details**:
  - **Spherical CNN Architecture**: Uses spherical convolutional neural networks (spherical CNNs) to handle the spherical geometry of omnidirectional images.
  - **Differentiable K-means Clustering**: Uses a differentiable K-means clustering formulation to generate superpixels, ensuring that the superpixels conform to spherical geometry.
  - **Data Augmentation**: Introduces data augmentation techniques specific to 360° images to enhance the model's learning capability with limited annotated data.

### Main Contributions

1. **First Proposal**: This is the first deep learning superpixel segmentation method specifically designed for omnidirectional images.
2. **Data Augmentation**: Proposes data augmentation strategies suitable for 360° images and validates their effectiveness through ablation experiments.
3. **Comprehensive Evaluation**: Compares the DSS method with existing traditional and deep learning-based methods on two datasets, demonstrating superior performance in segmentation accuracy and contour consistency.
4. **Code Release**: Provides the source code of the method, facilitating further research and application by the research community.

### Conclusion

By combining a spherical CNN architecture with data augmentation strategies specific to 360° images, the DSS method effectively handles the geometric distortions of omnidirectional images, generating high-quality superpixels and surpassing existing traditional and deep learning methods in terms of segmentation accuracy and contour consistency.
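
### Illustrative Sketch

Two of the ingredients named in the technical details above can be made concrete with a small example: a soft (differentiable) K-means assignment that measures distance on the sphere, and an augmentation that respects the 360° wrap-around. The sketch below is not the authors' implementation. It assumes an equirectangular input image, uses purely spatial great-circle distances where a real model would also use learned features, and picks a horizontal circular shift (a rotation about the vertical axis) as one plausible 360° augmentation, since the summary does not say which augmentations DSS uses. All function names and the `temperature` parameter are hypothetical.

```python
import math


def pixel_to_unit_vector(row, col, height, width):
    """Map an equirectangular pixel center to a 3D point on the unit sphere."""
    # Longitude covers [-pi, pi) across columns; latitude goes from +pi/2 (top) to -pi/2 (bottom).
    lon = (col + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (row + 0.5) / height * math.pi
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def great_circle_distance(p, q):
    """Angle in radians between two unit vectors (geodesic distance on the unit sphere)."""
    dot = sum(a * b for a, b in zip(p, q))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))


def soft_assignments(point, centers, temperature=0.1):
    """Softmax over negative squared geodesic distances to each cluster center.

    Replacing the hard argmin of K-means with a softmax keeps the assignment
    differentiable with respect to the distances.
    """
    logits = [-(great_circle_distance(point, c) ** 2) / temperature for c in centers]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def roll_horizontally(image, shift):
    """Circularly shift the columns of an image given as a list of rows.

    On an equirectangular image this corresponds to rotating the sphere
    about its vertical axis, so no content is lost at the borders.
    """
    rolled = []
    for row in image:
        k = shift % len(row) if row else 0
        rolled.append(row[len(row) - k:] + row[:len(row) - k])
    return rolled


if __name__ == "__main__":
    height, width = 4, 8
    left_edge = pixel_to_unit_vector(2, 0, height, width)
    right_edge = pixel_to_unit_vector(2, width - 1, height, width)
    middle = pixel_to_unit_vector(2, width // 2, height, width)
    # The left-edge pixel is soft-assigned mostly to the right-edge center,
    # because the two are neighbors on the sphere despite being far apart in the image.
    print(soft_assignments(left_edge, [right_edge, middle]))
    print(roll_horizontally([[1, 2, 3, 4], [5, 6, 7, 8]], 1))
```

The point of the geodesic distance is the wrap-around: pixels in the first and last columns of an equirectangular image are far apart in planar coordinates but adjacent on the sphere, so a planar distance would split a region that crosses the image border into two superpixels.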