Dual Super-Resolution Learning for Semantic Segmentation

Li Wang, Dong Li, Yousong Zhu, Lu Tian, Yi Shan

DOI: https://doi.org/10.1109/cvpr42600.2020.00383

2020-06-01

Abstract: Current state-of-the-art semantic segmentation methods often apply high-resolution input to attain high performance, which brings large computation budgets and limits their applications on resource-constrained devices. In this paper, we propose a simple and flexible two-stream framework named Dual Super-Resolution Learning (DSRL) to effectively improve the segmentation accuracy without introducing extra computation costs. Specifically, the proposed method consists of three parts: Semantic Segmentation Super-Resolution (SSSR), Single Image Super-Resolution (SISR) and Feature Affinity (FA) module, which can keep high-resolution representations with low-resolution input while simultaneously reducing the model computation complexity. Moreover, it can be easily generalized to other tasks, e.g., human pose estimation. This simple yet effective method leads to strong representations and is evidenced by promising performance on both semantic segmentation and human pose estimation. Specifically, for semantic segmentation on CityScapes, we can achieve ≥ 2% higher mIoU with similar FLOPs, and keep the performance with 70% FLOPs. For human pose estimation, we can gain ≥ 2% mAP with the same FLOPs and maintain mAP with 30% fewer FLOPs. Code and models are available at https://github.com/wanglixilinx/DSRL.

What problem does this paper attempt to address?

This paper addresses efficient, high-performance semantic segmentation on resource-constrained devices. Current state-of-the-art semantic segmentation methods usually rely on high-resolution input to reach high accuracy, which incurs a large computational cost and limits their use on such devices. The paper proposes a simple and flexible two-stream framework, Dual Super-Resolution Learning (DSRL), that aims to improve segmentation accuracy without adding computational cost. The framework consists of three parts: Semantic Segmentation Super-Resolution (SSSR), Single Image Super-Resolution (SISR), and a Feature Affinity (FA) module. With these components, DSRL keeps high-resolution representations while taking low-resolution input, which reduces the model's computational complexity. The method also generalizes easily to other tasks, such as human pose estimation. The reported results are promising on both tasks: for semantic segmentation on CityScapes, DSRL achieves at least 2% higher mIoU with similar FLOPs and keeps the performance with 70% FLOPs; for human pose estimation, it gains at least 2% mAP with the same FLOPs and maintains mAP with 30% fewer FLOPs.
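The summary names a Feature Affinity (FA) module but does not say how it links the two streams. One plausible reading, sketched below purely as an illustration, is that the segmentation stream is encouraged to reproduce the pairwise similarity structure of the super-resolution stream's features. The function names, the cosine-similarity choice, and the mean-absolute-difference penalty are all assumptions for this sketch, not details taken from the text above or from the released code.

```python
import math


def similarity_matrix(features):
    """Pairwise cosine similarity between per-pixel feature vectors.

    `features` is a list with one channel vector (a list of floats) per pixel.
    Hypothetical helper for illustration only.
    """
    normed = []
    for vec in features:
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against zero vectors
        normed.append([v / norm for v in vec])
    return [[sum(a * b for a, b in zip(p, q)) for q in normed] for p in normed]


def feature_affinity_loss(seg_features, sr_features):
    """Mean absolute difference between the two streams' similarity matrices.

    Assumed form of an affinity-style loss; both inputs must cover the same pixels.
    """
    s_seg = similarity_matrix(seg_features)
    s_sr = similarity_matrix(sr_features)
    n = len(s_seg)
    total = sum(abs(s_seg[i][j] - s_sr[i][j]) for i in range(n) for j in range(n))
    return total / (n * n)


# Two pixels, two channels per pixel.
seg = [[1.0, 0.0], [0.0, 1.0]]
sr_same_structure = [[2.0, 0.0], [0.0, 3.0]]  # different scale, same pairwise structure
sr_collapsed = [[1.0, 0.0], [1.0, 0.0]]       # both pixels look identical

loss_same = feature_affinity_loss(seg, sr_same_structure)
loss_collapsed = feature_affinity_loss(seg, sr_collapsed)
```

Because the vectors are normalized before comparison, the loss in this sketch ignores feature magnitude and penalizes only differences in how pixels relate to one another, which is why the rescaled features incur no penalty while the collapsed ones do.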

Deep Dual-Stream Network with Scale Context Selection Attention Module for Semantic Segmentation

Super-Resolution Based Patch-Free 3D Medical Image Segmentation with Self-Supervised Guidance

Multi-Resolution Learning and Semantic Edge Enhancement for Super-Resolution Semantic Segmentation of Urban Scene Images

A Dual Network for Super-Resolution and Semantic Segmentation of Sentinel-2 Imagery

Collaborative Network for Super-Resolution and Semantic Segmentation of Remote Sensing Images

Semantic Segmentation Prior for Diffusion-Based Real-World Super-Resolution

Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes

ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation

DWRSeg: Rethinking Efficient Acquisition of Multi-scale Contextual Information for Real-time Semantic Segmentation

Advancing high-resolution remote sensing: a compact and powerful approach to semantic segmentation

Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation

DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

Improving Semi-Supervised Semantic Segmentation with Dual-Level Siamese Structure Network

Learning Dual Multi-Scale Manifold Ranking for Semantic Segmentation of High-Resolution Images

Research on an Intelligent Driving Algorithm Based on the Double Super-Resolution Network

Deep Dual-Resolution Networks for Real-Time and Accurate Semantic Segmentation of Traffic Scenes

Dual semantic-guided model for weakly-supervised zero-shot semantic segmentation

Detail-Optimized Super-Resolution Reconstruction-Based Multistage Training Strategy for Remote Sensing Semantic Segmentation

Double Similarity Distillation for Semantic Image Segmentation

A bidirectional semantic segmentation method for remote sensing image based on super-resolution and domain adaptation