Improved segmentation under extreme imbalance towards full background images
Eduardo Rocha de Andrade, Levy Boccato
DOI: https://doi.org/10.1016/j.eswa.2024.124273
IF: 8.5
2024-05-30
Expert Systems with Applications
Abstract: With the advent of modern convolutional neural networks, many computer vision tasks, including semantic segmentation, have seen great improvements. However, in real applications, a segmentation network may also need to cope with images that contain no object of interest, i.e., all pixels belong to the background class. These empty images create a stark imbalance between classes in the dataset, which may hinder the model's performance. In this work, we analyze this particular scenario and its implications. We also investigate the two most common solutions: single-stage segmentation and two-stage classification-segmentation pipelines. As our main contribution, we propose a set of modifications based on semi-global context and attention mechanisms that can be applied to the decoder of most state-of-the-art segmentation networks and that are specifically tailored to semantic segmentation with full background images. In addition, we propose an auxiliary segmentation loss for foreground pixels, which brings additional improvements in IoU for extremely class-imbalanced cases and helps to stabilize the training process. Our proposals are comprehensively evaluated on two datasets with different characteristics, demonstrating consistent IoU gains of up to 15% and 25% against their best single- and two-stage competitors, respectively. Finally, we perform ablation studies to better understand the underlying mechanisms of the proposed approach. The source code will be available at https://github.com/arc144/improved-seg-background-images .
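Illustrative sketch (not from the paper): the abstract does not state the form of the auxiliary foreground loss, so the code below shows only one plausible reading, a binary cross-entropy averaged over foreground pixels and added to the usual per-pixel loss. The function names, the weight aux_weight, and the choice to return zero for empty images are all assumptions made for illustration, not the authors' formulation.

```python
import math


def _flatten(grid):
    """Flatten a 2D list (rows of pixels) into a 1D list."""
    return [value for row in grid for value in row]


def pixel_bce(probs, target, eps=1e-7):
    """Standard binary cross-entropy averaged over every pixel."""
    p = [min(max(v, eps), 1.0 - eps) for v in _flatten(probs)]
    t = _flatten(target)
    losses = [-math.log(pi) if ti else -math.log(1.0 - pi) for pi, ti in zip(p, t)]
    return sum(losses) / len(losses)


def foreground_bce(probs, target, eps=1e-7):
    """Hypothetical auxiliary term: cross-entropy over foreground pixels only.

    Assumption: an image with no foreground pixels contributes zero, so
    empty images cannot dominate this term.
    """
    p = [min(max(v, eps), 1.0 - eps) for v in _flatten(probs)]
    t = _flatten(target)
    fg_losses = [-math.log(pi) for pi, ti in zip(p, t) if ti]
    if not fg_losses:
        return 0.0
    return sum(fg_losses) / len(fg_losses)


def total_loss(probs, target, aux_weight=0.5):
    """Main loss plus the weighted auxiliary term (aux_weight is an assumed value)."""
    return pixel_bce(probs, target) + aux_weight * foreground_bce(probs, target)


if __name__ == "__main__":
    probs = [[0.5, 0.5], [0.5, 0.5]]
    one_object = [[1, 0], [0, 0]]
    empty = [[0, 0], [0, 0]]
    print(total_loss(probs, one_object))  # auxiliary term is active
    print(total_loss(probs, empty))       # auxiliary term is zero
```

In this sketch a full background image adds nothing to the auxiliary term, so the gradient of that term comes only from images that contain an object, however rare they are in the dataset.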
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science