Abstract: Semantic segmentation plays a crucial role in Internet of Things applications such as industrial robotics and self-driving. Recently, deep learning approaches have greatly improved semantic segmentation accuracy. However, their overall performance, considering accuracy and efficiency together, is still far from satisfactory. We observe that (1) accuracy-oriented methods rely on numerous convolution layers and sophisticated architectures, which incur heavy computational cost and usually lead to long inference times; and (2) efficiency-oriented methods fail to capture multiscale context information for discriminative representations during feature fusion, which leads to suboptimal performance. Previous semantic segmentation approaches fail to address these two challenges simultaneously. To resolve the trade-off between precise segmentation and efficient inference, we propose a novel lightweight Multiscale Information Fusion Network (MIFNet). Specifically, MIFNet mainly consists of two core components: the Pyramid Refinement Connection Module (PRCM) and the Lightweight Information Fusion Module (LIFM). The PRCM exploits skip learning to establish dependencies between different stages. Meanwhile, the pyramid attention mechanism (PAM) in the PRCM, which adjusts the weights of a hybrid pyramid attention vector to refine low-level spatial features, is developed to alleviate the semantic gap. Moreover, the LIFM is designed to detect objects at multiple scales from a global-local perspective. In the LIFM, the proposed multiscale dense concatenation (MDC) adopts various dilated convolutions to extract multiscale local context information. Extensive experimental results on benchmark data sets demonstrate that the proposed MIFNet performs significantly better than most existing state-of-the-art methods.
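To make the idea of extracting multiscale local context with dilated convolutions more concrete, the following is a minimal pure-Python sketch on a 1D signal. It is not the authors' MDC implementation: the function names (`dilated_conv1d`, `receptive_field`, `multiscale_dense_concat`), the 1D setting, the fixed kernel, and the use of an element-wise mean as a stand-in for fusing densely connected features are all assumptions made for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Same-length 1D dilated cross-correlation with zero padding."""
    center = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            idx = i + (j - center) * dilation
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def multiscale_dense_concat(signal: Sequence[float], kernel: Sequence[float],
                            dilations: Sequence[int]) -> List[List[float]]:
    """Hypothetical dense multiscale block (illustrative assumption).

    Each branch receives the element-wise mean of the input and all earlier
    branch outputs, applies a dilated convolution at its own rate, and the
    input plus every branch output are returned as separate channels.
    """
    features = [list(signal)]
    for d in dilations:
        # Stand-in for fusing the densely connected features so far.
        fused = [sum(col) / len(features) for col in zip(*features)]
        features.append(dilated_conv1d(fused, kernel, d))
    return features


impulse = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
channels = multiscale_dense_concat(impulse, [1.0, 1.0, 1.0], [1, 2])
for rate, channel in zip(["input", 1, 2], channels):
    print(rate, channel)
```

With a kernel of size 3, a dilation rate of 2 spans 5 input positions while still using only 3 weights, which is why stacking several rates widens the captured context without adding parameters per branch.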

Multiscale Fusion Convolutional Network in Real-time Semantic Segmentation

Deep Dual-Stream Network with Scale Context Selection Attention Module for Semantic Segmentation

Enhanced Multi-Scale Feature Adaptive Fusion Sparse Convolutional Network for Large-Scale Scenes Semantic Segmentation

MCFNet: Multi-scale Covariance Feature Fusion Network for Real-time Semantic Segmentation

MSCFNet: A Lightweight Network with Multi-Scale Context Fusion for Real-Time Semantic Segmentation

Real-Time Semantic Segmentation via Multiply Spatial Fusion Network

A Crossmodal Multiscale Fusion Network for Semantic Segmentation of Remote Sensing Data

MFAFNet: A Lightweight and Efficient Network with Multi-Level Feature Adaptive Fusion for Real-Time Semantic Segmentation

MAFNet: dual-branch fusion network with multiscale atrous pyramid pooling aggregate contextual features for real-time semantic segmentation

CIMFNet: Cross-layer Interaction and Multiscale Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images

DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

Multi-Scale Fusion With Matching Attention Model: A Novel Decoding Network Cooperated With NAS for Real-Time Semantic Segmentation

Multi-Scale Cross-Attention Fusion Network Based on Image Super-Resolution

MLFNet: Multi-Level Fusion Network for Real-Time Semantic Segmentation of Autonomous Driving

Encoder- and Decoder-Based Networks Using Multiscale Feature Fusion and Nonlocal Block for Remote Sensing Image Semantic Segmentation

Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes

ZMNet: feature fusion and semantic boundary supervision for real-time semantic segmentation

MCAFNet: A Multiscale Channel Attention Fusion Network for Semantic Segmentation of Remote Sensing Images

MFEAFN: Multi-scale feature enhanced adaptive fusion network for image semantic segmentation

MIFNet: A Lightweight Multiscale Information Fusion Network

An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery