Abstract: Semantic segmentation is a fundamental problem in multimedia that requires delicate per-pixel predictions of object categories. Recently, many researchers have sought to refine pixel-wise features with spatial-contextual information. However, many of them still neglect the invisible hand of cross-channel information, which provides inherent semantics that benefit segmentation performance. On the one hand, in the feature extraction stage, enhancing informative channels and suppressing trivial ones contributes to the acquisition of valuable semantic features and thus improves segmentation accuracy. On the other hand, in the prediction stage, we can predict complete objects more clearly by finding the connections and complements between different channels, which also contributes to pixel prediction. Based on this idea, we propose a novel Channel-Adaptive Network for semantic segmentation, which enhances features from the perspective of channels in both the feature extraction stage and the prediction stage. Specifically, we propose two modules: (i) the Comprehensive Information Channel Attention (CiCA) module, which addresses the shortcomings of existing channel attention by learning both low- and high-frequency components within each channel to emphasize the informative channels; and (ii) the Inter-Channel Relationship Reasoning (iCRR) module, which is applied on top of the feature extractor to adaptively enhance interdependent channels by mining the complementary associations between them. In addition, our Channel-Adaptive Network is highly flexible, with a plug-and-play design. Extensive experiments demonstrate that our method achieves state-of-the-art segmentation performance on three challenging datasets: Cityscapes (82.1%), ADE20K (46.51%) and PASCAL Context (55.0%).
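
To make the idea of "enhancing informative channels and suppressing trivial ones" concrete, the following is a minimal sketch of generic squeeze-and-excitation style channel gating, the kind of existing channel attention the abstract refers to. It is not the CiCA or iCRR module, whose details are not given here. The function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical; in a real network the weights would be learned rather than passed in by hand.

```python
import math


def channel_attention(features, w1, w2):
    """Generic channel gating sketch (assumption: not the paper's CiCA module).

    features: list of C channels, each an H x W nested list of floats.
    w1: hidden x C weight matrix (hypothetical, normally learned).
    w2: C x hidden weight matrix (hypothetical, normally learned).
    Returns (gates, scaled_features).
    """
    # Squeeze: summarize each channel by its global average.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in features
    ]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: fully connected layer followed by a sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, features)
    ]
    return gates, scaled


if __name__ == "__main__":
    feats = [
        [[1.0, 1.0], [1.0, 1.0]],  # an "informative" channel
        [[0.0, 0.0], [0.0, 0.0]],  # a "trivial" channel
    ]
    gates, scaled = channel_attention(feats, [[1.0, 0.0]], [[2.0], [-2.0]])
    print(gates)
```

With these hand-picked weights, the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first is emphasized relative to the second. Because the global average corresponds to the lowest-frequency summary of a channel, a gate built only on it ignores higher-frequency content, which is the kind of limitation the abstract says CiCA is designed to address.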

Attention-Guided Network for Semantic Video Segmentation

Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation

CSANet for Video Semantic Segmentation with Inter-Frame Mutual Learning

Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion Utilizing a Multi-Scale Dilated Convolutional Pyramid

Semantic Image Segmentation with Improved Position Attention and Feature Fusion

Enhanced Memory Network for Video Segmentation

Capturing the Spatio-Temporal Continuity for Video Semantic Segmentation

Dual Path Attention Net For Remote Sensing Semantic Image Segmentation

AANet: Adaptive Attention Networks for Semantic Segmentation of High-Resolution Remote Sensing Imagery

A Semantic Segmentation Algorithm Based on Improved Attention Mechanism

Category-Based Interactive Attention and Perception Fusion Network for Semantic Segmentation of Remote Sensing Images

Lightweight Attention Network for Very High-Resolution Image Semantic Segmentation

EANET: Efficient Attention-Augmented Network for Real-Time Semantic Segmentation

A Semantic Segmentation Network Simulating the Ventral and Dorsal Pathways of the Cerebral Visual Cortex

Learning Cross-Channel Representations for Semantic Segmentation

AMNet: Convolutional Neural Network embeded with Attention Mechanism for Semantic Segmentation

RELAXNet: Residual Efficient Learning and Attention Expected Fusion Network for Real-Time Semantic Segmentation

DSANet: Dilated Spatial Attention for Real-Time Semantic Segmentation in Urban Street Scenes

Real-time semantic segmentation network based on parallel atrous convolution for short-term dense concatenate and attention feature fusion

A Synergistical Attention Model for Semantic Segmentation of Remote Sensing Images

Fusion target attention mask generation network for video segmentation