Learning Semantic Segmentation with Query Points Supervision on Aerial Images

Santiago Rivier, Carlos Hinojosa, Silvio Giancola, Bernard Ghanem
2024-08-06
Abstract: Semantic segmentation is crucial in remote sensing, where high-resolution satellite images are segmented into meaningful regions. Recent advancements in deep learning have significantly improved satellite image segmentation. However, most of these methods are typically trained in fully supervised settings that require high-quality pixel-level annotations, which are expensive and time-consuming to obtain. In this work, we present a weakly supervised learning algorithm to train semantic segmentation algorithms that only rely on query point annotations instead of full mask labels. Our proposed approach performs accurate semantic segmentation and improves efficiency by significantly reducing the cost and time required for manual annotation. Specifically, we generate superpixels and extend the query point labels into those superpixels that group similar meaningful semantics. Then, we train semantic segmentation models supervised with images partially labeled with the superpixel pseudo-labels. We benchmark our weakly supervised training approach on an aerial image dataset and different semantic segmentation architectures, showing that we can reach competitive performance compared to fully supervised training while reducing the annotation effort. The code of our proposed approach is publicly available at: https://github.com/santiago2205/LSSQPS
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper attempts to address the problem of reducing the reliance on high-quality pixel-level annotated data in the semantic segmentation task of high-resolution satellite images. Specifically, traditional deep learning methods, although achieving significant progress in semantic segmentation, usually require a large amount of pixel-level annotated data, which is both expensive and time-consuming to obtain. To solve this problem, the paper proposes a weakly supervised learning algorithm that relies only on point annotations (instead of full mask labels) to train the semantic segmentation model.

### Main Contributions:

1. **Weakly Supervised Annotation Expansion**: A new method is proposed to expand point annotations to superpixel regions, generating partially annotated pseudo masks for training the semantic segmentation model.
2. **Weighted Mask Loss Function**: A new weighted mask loss function is introduced, which only calculates the loss of annotated pixels and balances the weight of each category.
3. **Performance Validation**: Extensive experiments were conducted on aerial image datasets to validate the effectiveness and competitiveness of the method under different model architectures.

### Key Steps of the Solution:

1. **Point Selection**: Users select query points as the sole source of supervision information.
2. **Superpixel Extraction**: The DAL-HERS algorithm is used to generate superpixel regions.
3. **Pseudo Mask Generation**: Point annotations are expanded to superpixel regions to generate partially annotated pseudo masks.
4. **Model Training**: The generated pseudo masks and the proposed weighted mask loss function are used to train the semantic segmentation model.

Through these steps, the method can maintain high segmentation accuracy while reducing annotation efforts, achieving performance comparable to fully supervised methods in a weakly supervised setting.
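
To make the pseudo mask generation and the masked, class-balanced loss more concrete, the following is a minimal sketch in dependency-free Python; it is not the authors' implementation. It assumes a superpixel map is already available (for example, one produced by DAL-HERS) as a 2D grid of superpixel ids. The function names, the `IGNORE` marker, the first-point-wins rule for conflicting points, and the inverse-frequency class weights are all illustrative assumptions; the summary only states that the loss is computed on annotated pixels and balances the categories.

```python
import math

IGNORE = -1  # hypothetical marker for pixels that receive no pseudo-label


def expand_points_to_superpixels(superpixels, points):
    """Spread each query point's class over the superpixel that contains it.

    superpixels: 2D list of superpixel ids (assumed precomputed).
    points: list of (row, col, class_id) query point annotations.
    Returns a 2D pseudo mask; pixels in unannotated superpixels get IGNORE.
    """
    sp_label = {}
    for r, c, cls in points:
        # Assumption: if two points fall in one superpixel, the first one wins.
        sp_label.setdefault(superpixels[r][c], cls)
    return [[sp_label.get(sp, IGNORE) for sp in row] for row in superpixels]


def class_weights(pseudo_mask, num_classes):
    """Inverse-frequency weights over labeled pixels (an assumed balancing scheme)."""
    counts = [0] * num_classes
    for row in pseudo_mask:
        for cls in row:
            if cls != IGNORE:
                counts[cls] += 1
    total = sum(counts)
    present = sum(1 for n in counts if n > 0)
    return [total / (present * n) if n > 0 else 0.0 for n in counts]


def weighted_masked_cross_entropy(probs, pseudo_mask, weights):
    """Weighted mean of -log p(true class), skipping IGNORE pixels.

    probs[r][c] is a list of per-class probabilities for pixel (r, c).
    """
    num, den = 0.0, 0.0
    for prob_row, label_row in zip(probs, pseudo_mask):
        for p, cls in zip(prob_row, label_row):
            if cls == IGNORE:
                continue  # unlabeled pixels contribute nothing to the loss
            num += -weights[cls] * math.log(max(p[cls], 1e-12))
            den += weights[cls]
    return num / den if den > 0 else 0.0


# Toy 3x3 image with three superpixels and two query points.
superpixels = [[0, 0, 1],
               [0, 2, 1],
               [2, 2, 1]]
points = [(0, 0, 1), (2, 2, 0)]  # (row, col, class_id)

pseudo = expand_points_to_superpixels(superpixels, points)
weights = class_weights(pseudo, num_classes=2)
uniform = [[[0.5, 0.5] for _ in row] for row in superpixels]
loss = weighted_masked_cross_entropy(uniform, pseudo, weights)
```

In this toy case, superpixel 2 contains no query point, so its three pixels stay unlabeled and are excluded from the loss. In a real training setup the same idea would typically be expressed with tensor operations and an ignore index rather than Python loops.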