Abstract: We consider the problem of semantic segmentation, i.e. assigning each pixel in an image to one of a set of pre-defined semantic object categories. State-of-the-art semantic segmentation algorithms typically consist of three components: a local appearance model, a local consistency model and a global consistency model. These three components are generally integrated into a unified probabilistic framework. While this integration enables a joint estimation of the model parameters at training time and ensures a globally consistent labeling of the pixels at test time, it also comes at a high computational cost.

We propose a simple approach to semantic segmentation in which the three components are decoupled (this journal submission is an extended version of the following conference paper: G. Csurka and F. Perronnin, “A simple high performance approach to semantic segmentation”, BMVC, 2008). For the local appearance model, we make use of the Fisher kernel. While this framework was shown to lead to high accuracy for image classification, to the best of our knowledge this is its first application to the segmentation problem. The semantic segmentation process is then guided by a low-level segmentation which enforces local consistency. Finally, to enforce image-level consistency we use global image classifiers: if an image as a whole is unlikely to contain an object class, then the corresponding class is not considered in the segmentation pipeline.

The decoupling of the components makes our system very efficient at both training and test time. Efficient training makes it possible to estimate the model parameters on large quantities of data. In particular, we explain how our system can leverage weakly labeled data, i.e. images for which we do not have pixel-level labels but only object bounding boxes or even only image-level labels.

We believe that an important contribution of this paper is to show that even a simple decoupled system can provide state-of-the-art performance on the PASCAL VOC 2007, PASCAL VOC 2008 and MSRC 21 datasets.
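The way the three decoupled components fit together can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `decoupled_segmentation`, the flat per-pixel score lists, the region averaging rule and the `threshold` parameter are all assumptions made for illustration, and the local appearance scores are taken as given rather than computed with a Fisher kernel.

```python
from collections import defaultdict


def decoupled_segmentation(pixel_scores, region_ids, image_scores, threshold=0.5):
    """Label pixels using three independent components (hypothetical sketch).

    pixel_scores: per-pixel lists of class scores (stand-in for the local appearance model).
    region_ids:   per-pixel region id from a low-level segmentation (local consistency).
    image_scores: per-class scores from global image classifiers (global consistency).
    """
    num_classes = len(image_scores)

    # Global consistency: drop classes the image as a whole is unlikely to contain.
    allowed = [c for c, s in enumerate(image_scores) if s >= threshold]
    if not allowed:
        # Assumed fallback: keep the single most likely class.
        allowed = [max(range(num_classes), key=lambda c: image_scores[c])]

    # Local consistency: accumulate appearance scores over each low-level region.
    sums = defaultdict(lambda: [0.0] * num_classes)
    counts = defaultdict(int)
    for scores, region in zip(pixel_scores, region_ids):
        counts[region] += 1
        for c in allowed:
            sums[region][c] += scores[c]

    # Every pixel in a region receives the allowed class with the best mean score.
    region_label = {
        region: max(allowed, key=lambda c: sums[region][c] / counts[region])
        for region in sums
    }
    return [region_label[region] for region in region_ids]


# Toy example: four pixels, two regions, three classes.
pixel_scores = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
region_ids = [0, 0, 1, 1]
image_scores = [0.8, 0.6, 0.1]  # class 2 is unlikely at the image level
labels = decoupled_segmentation(pixel_scores, region_ids, image_scores)
```

In this toy input, class 2 has the highest local scores in the second region, but it is excluded because its image-level score falls below the assumed threshold, so that region is assigned the best remaining class.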

Efficient Object Region Discovery for Weakly-supervised Semantic Segmentation

Omnisupervised Omnidirectional Semantic Segmentation

Deep Dual-Stream Network with Scale Context Selection Attention Module for Semantic Segmentation

STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation

Weakly Supervised Semantic Segmentation Based on Co-segmentation

Weakly Supervised Instance Segmentation by Exploring Entire Object Regions

Weakly Supervised Semantic Segmentation Based on Web Image Co-segmentation

Lightweight semantic segmentation network with configurable context and small object attention

Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation

Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation

Fast-SegNet: fast semantic segmentation network for small objects

High-performance Semantic Segmentation Using Very Deep Fully Convolutional Networks

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

Weakly-Supervised Point Cloud Semantic Segmentation Based on Dilated Region

Local structure consistency and pixel-correlation distillation for compact semantic segmentation

Weakly Supervised Semantic Segmentation via Box-Driven Masking and Filling Rate Shifting

Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation

An Efficient Approach to Semantic Segmentation

Improving Semantic Segmentation via Efficient Self-Training

A novel seminar learning framework for weakly supervised salient object detection

Efficiently Expanding Receptive Fields: Local Split Attention and Parallel Aggregation for Enhanced Large-scale Point Cloud Semantic Segmentation