Abstract: Co-occurrent visual patterns suggest that pixel relation modeling facilitates dense prediction tasks, which has inspired the development of numerous context modeling paradigms, \emph{e.g.}, multi-scale-driven and similarity-driven context schemes. Despite their impressive results, these existing paradigms often suffer from inadequate or ineffective aggregation of contextual information because they rely on a large number of predetermined priors. To alleviate these issues, we propose a novel \textbf{I}ntervention-\textbf{D}riven \textbf{R}elation \textbf{Net}work (\textbf{IDRNet}), which leverages a deletion diagnostics procedure to guide the modeling of contextual relations among different pixels. Specifically, we first group pixel-level representations into semantic-level representations under the guidance of pseudo labels, and further improve the distinguishability of the grouped representations with a feature enhancement module. Next, a deletion diagnostics procedure is conducted to model the relations among these semantic-level representations by perceiving the network outputs, and the extracted relations are used to guide the semantic-level representations to interact with each other. Finally, the interacted representations are used to augment the original pixel-level representations for the final predictions. Extensive experiments are conducted to validate the effectiveness of IDRNet quantitatively and qualitatively. Notably, our intervention-driven context scheme brings consistent performance improvements to state-of-the-art segmentation frameworks and achieves competitive results on popular benchmark datasets, including ADE20K, COCO-Stuff, PASCAL-Context, LIP, and Cityscapes. Code is available at \url{https://github.com/SegmentationBLWX/sssegmentation}.
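
The two central steps named in the abstract, grouping pixel-level representations by pseudo labels and scoring relations through deletion diagnostics, can be illustrated with a minimal NumPy sketch. This is not the IDRNet implementation: the function names, the per-class averaging, and the `toy_score` stand-in for "network outputs" are all assumptions made for illustration, and the feature enhancement module and the final augmentation step are omitted.

```python
import numpy as np


def group_by_pseudo_labels(features, pseudo_labels, num_classes):
    """Average pixel features per pseudo-label class (assumed grouping rule).

    features: (N, C) float array, pseudo_labels: (N,) int array.
    Returns a (num_classes, C) array and a boolean mask of classes present.
    """
    grouped = np.zeros((num_classes, features.shape[1]), dtype=features.dtype)
    present = np.zeros(num_classes, dtype=bool)
    for k in range(num_classes):
        mask = pseudo_labels == k
        if mask.any():
            grouped[k] = features[mask].mean(axis=0)
            present[k] = True
    return grouped, present


def toy_score(grouped, kept):
    """Hypothetical stand-in for the network output: each class is scored
    against the mean of the class representations that are kept."""
    if not kept.any():
        return np.zeros(grouped.shape[0])
    context = grouped[kept].mean(axis=0)
    return grouped @ context


def deletion_relations(grouped, present, score_fn):
    """relations[i, j] = |change in class i's score when class j is deleted|."""
    num_classes = grouped.shape[0]
    base = score_fn(grouped, present)
    relations = np.zeros((num_classes, num_classes))
    for j in np.flatnonzero(present):
        kept = present.copy()
        kept[j] = False  # delete class j, then observe the output change
        relations[:, j] = np.abs(base - score_fn(grouped, kept))
    return relations


# Tiny example: four pixels, two feature channels, three candidate classes.
features = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
pseudo_labels = np.array([0, 0, 1, 1])
grouped, present = group_by_pseudo_labels(features, pseudo_labels, num_classes=3)
relations = deletion_relations(grouped, present, toy_score)
```

In this sketch a larger entry in `relations` means that deleting one class representation changes the score of another more strongly; classes absent from the pseudo labels are never deleted, so their columns stay zero.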

RelationNet: Learning Deep-Aligned Representation for Semantic Image Segmentation

Dense Relation Network: Learning Consistent and Context-Aware Representation for Semantic Image Segmentation

IIE-SegNet: Deep Semantic Segmentation Network With Enhanced Boundary Based on Image Information Entropy

A Deep Semantic Segmentation Network with Semantic and Contextual Refinements

Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation

Learning Cross-Channel Representations for Semantic Segmentation

HDNet: Hybrid Distance Network for semantic segmentation

SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

IDRNet: Intervention-Driven Relation Network for Semantic Segmentation

Image semantic segmentation based on improved DeepLabv3+ network and superpixel edge optimization

Semantic Image Segmentation with Improved Position Attention and Feature Fusion

A Top-Down Manner-Based DCNN Architecture for Semantic Image Segmentation

NeiEA-NET: Semantic segmentation of large-scale point cloud scene via neighbor enhancement and aggregation

Semantic segmentation for remote sensing images via dense feature extraction and companion loss neural network

Long and short-range relevance context network for semantic segmentation

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

High-performance Semantic Segmentation Using Very Deep Fully Convolutional Networks

Semantic Segmentation of Aerial Imagery Via Split-Attention Networks with Disentangled Nonlocal and Edge Supervision

RSI-Net: Two-Stream Deep Neural Network for Remote Sensing Images-Based Semantic Segmentation

Discriminative Features Reconstruction Network For Semantic Segmentation