Abstract: Image parsing is vital for many high-level image understanding tasks. Although both parametric and non-parametric approaches have achieved remarkable success, many technical challenges remain for images containing things/objects with broad coverage and high variability, because the field still lacks versatile and effective strategies to seamlessly integrate local-global feature selection, contextual cue exploitation, spatial layout encoding, data-driven coherency exploration, and flexible accommodation of newly annotated labels. To address these challenges, this paper develops a novel automatic non-parametric image parsing method that combines the advantages of parametric and non-parametric methodologies through new modeling and inference strategies. The originality of our approach lies in employing sparse-dense reconstruction as a latent learning model to conduct candidate-label probability analysis over multi-level local regions, while simultaneously leveraging context-specific local-global label confidence propagation and global semantic spatial-contextual cues to guide holistic scene parsing. Towards this goal, we devise several novel technical components that together form a lightweight parsing framework: local region representation integrating complementary features, anisotropic consistency propagation based on a bi-harmonic distance metric, bottom-up label voting, semantic string generation of image-level spatial-contextual cues based on the Hilbert space-filling curve, and co-occurrence prior analysis based on a relaxed string matching algorithm. Collectively, these components enable us to address the aforementioned problems effectively. Moreover, we conduct comprehensive experiments on public benchmarks and quantitative comparisons against state-of-the-art methods, which demonstrate the advantages of our method in accuracy, versatility, flexibility, and efficiency.
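
To make the idea of "semantic string generation ... based on the Hilbert space-filling curve" more concrete, the following is a minimal sketch, not the paper's implementation. It assumes the image has already been reduced to a square grid of single-character region labels whose side is a power of two; the function names (`hilbert_d2xy`, `semantic_string`, `string_similarity`) and the grid format are hypothetical. The abstract does not specify the relaxed string matching algorithm, so Python's `difflib.SequenceMatcher` is used here purely as a stand-in similarity measure.

```python
from difflib import SequenceMatcher


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the current sub-square so consecutive indices stay adjacent.
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def semantic_string(label_grid):
    """Read a square grid of one-character labels in Hilbert-curve order.

    Assumption: label_grid[y][x] holds the label of the cell at column x, row y,
    and the grid side is a power of two.
    """
    n = len(label_grid)
    order = n.bit_length() - 1
    if n == 0 or (1 << order) != n or any(len(row) != n for row in label_grid):
        raise ValueError("label_grid must be square with a power-of-two side")
    cells = (hilbert_d2xy(order, d) for d in range(n * n))
    return "".join(label_grid[y][x] for x, y in cells)


def string_similarity(a, b):
    """Stand-in for relaxed string matching: difflib's similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Hypothetical 4x4 label layouts: s = sky, t = tree, r = road.
    scene_a = ["ssss", "sstt", "rrtt", "rrrr"]
    scene_b = ["ssss", "ssst", "rrtt", "rrrr"]
    str_a = semantic_string(scene_a)
    str_b = semantic_string(scene_b)
    print(str_a, str_b, string_similarity(str_a, str_b))
```

Because the Hilbert curve visits spatially adjacent cells consecutively, labels that are neighbors in the image tend to stay close together in the resulting string, which is why a string comparison can serve as a rough proxy for comparing spatial layouts.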

Weakly-supervised Scene Parsing with Multiple Contextual Cues

Hierarchical Scene Parsing by Weakly Supervised Learning with Image Descriptions

Automatic non-parametric image parsing via hierarchical semantic voting based on sparse-dense reconstruction and spatial-contextual cues.

Weakly Supervised Scene Parsing with Point-Based Distance Metric Learning.

Scene Parsing from an MAP Perspective.

Human Parsing by Weak Structural Label

Weakly Supervised Graph Propagation Towards Collective Image Parsing

Deep Structured Scene Parsing by Learning with Image Descriptions

Weakly Supervised Image Parsing by Discriminatively Semantic Graph Propagation

Weakly-Supervised Image Parsing Via Constructing Semantic Graphs and Hypergraphs

Can Image-Level Labels Replace Pixel-Level Labels for Image Parsing

Joint Margin, Cograph, and Label Constraints for Semisupervised Scene Parsing from Point Clouds

Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck

Scene Parsing Via Integrated Classification Model and Variance-Based Regularization.

Scene Parsing with Multiscale Feature Learning, Purity Trees, and Optimal Covers

Weakly Supervised Semantic Segmentation with a Multiscale Model

Prior-structure Driven Weakly-supervised Learning for Fine-grained Human Parsing

Multi-layer Feature Aggregation for Deep Scene Parsing Models

Weaklier Supervised Semantic Segmentation with Only One Image Level Annotation Per Category.

Weakly Supervised Point Cloud Semantic Segmentation Based on Scene Consistency