Abstract: The rapid development and commercialization of unmanned aerial vehicle (UAV) technology has made it possible to extract urban traffic information from UAV images. However, large variations among targets in urban environments, complex foregrounds and backgrounds, and severe occlusion by trees and shadows make car and road extraction from UAV images challenging. In this study, we propose a lightweight, efficient dual contextual parsing network (EDCPNet) to address these issues. The efficient dual contextual parsing (EDCP) module in EDCPNet is mainly composed of spatial contextual parsing (SCP) and channel contextual parsing (CCP). Together they acquire rich contextual features in both the spatial and channel dimensions, adaptively recalibrate the attention weights, perceive the salient features of targets in images, and suppress irrelevant elements. This improves performance and adaptability, which facilitates practical large-scale urban traffic monitoring in complex urban scenes. We conduct experiments on two benchmark datasets, the UAV image dataset (UAVid) and the urban drone dataset (UDD), comparing the proposed EDCPNet with six competing methods, i.e., U-Net, PSPNet, DeepLabv3+, SegNet, ESNet, and ERFNet, and validate the effectiveness of the proposed EDCP module via extensive ablation studies. The results suggest that the proposed network outperforms all competing methods in car and road extraction from UAV images at a balanced computational cost. Its strong performance and low computational demand (only 2.37M model parameters) facilitate its deployment on edge computing devices with memory constraints.
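
To give a sense of scale for the 2.37M-parameter figure, the short sketch below estimates how much storage the weights alone would need at a few common numeric precisions. Only the parameter count comes from the abstract. The precisions, the helper name `weight_megabytes`, and the convention 1 MB = 10^6 bytes are assumptions made for illustration. Activations, buffers, and runtime overhead are not counted, so the actual on-device memory use would be higher.

```python
# Rough weight-storage estimate. Only the parameter count is taken from the abstract.
PARAMS = 2_370_000  # 2.37M model parameters

# Bytes per parameter for common storage precisions (assumed for illustration).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_megabytes(num_params: int, bytes_per_param: int) -> float:
    """Return weight storage in megabytes, using 1 MB = 10**6 bytes."""
    return num_params * bytes_per_param / 1e6


# Weight footprint per precision; excludes activations and runtime overhead.
footprint = {
    name: weight_megabytes(PARAMS, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, megabytes in footprint.items():
    print(f"{name}: {megabytes:.2f} MB")
```

Under these assumptions the weights take roughly 9.48 MB in 32-bit floating point, which is consistent with the abstract's claim that the model suits memory-constrained edge devices.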

ECNet: an Efficient and Context-Aware Network for Street Scene Parsing

Efficient Light Deep Network for Street Scene Parsing

EKENet: Efficient knowledge enhanced network for real-time scene parsing

Parsing Very High Resolution Urban Scene Images by Learning Deep ConvNets with Edge-Aware Loss

EFRNet: Efficient Feature Reconstructing Network for Real-Time Scene Parsing

Adaptive Context Network for Scene Parsing

Improve SegNet with Feature Pyramid for Road Scene Parsing

EPRNet: Efficient Pyramid Representation Network for Real-Time Street Scene Segmentation

Context-Integrated and Feature-Refined Network for Lightweight Object Parsing

Toward Achieving Robust Low-Level and High-Level Scene Parsing

OCNet: Object Context Network for Scene Parsing

EANET: Efficient Attention-Augmented Network for Real-Time Semantic Segmentation

RAPNet: Residual Atrous Pyramid Network for Importance-Aware Street Scene Parsing

Lightweight cross-guided contextual perceptive network for visible–infrared urban road scene parsing

DSANet: Dilated Spatial Attention for Real-Time Semantic Segmentation in Urban Street Scenes

EADNet: Efficient Asymmetric Dilated Network for Semantic Segmentation

DenseASPP for Semantic Segmentation in Street Scenes

Fast Semantic Segmentation for Scene Perception

ELANet: an efficiently lightweight asymmetrical network for real-time semantic segmentation

High Resolution Scene Parsing Network Based on Semantic Segmentation

Road and Car Extraction Using UAV Images Via Efficient Dual Contextual Parsing Network