Abstract: Urban road extraction is important for urban planning and transportation applications. High-resolution imagery (HRI) is one of the most popular data sources for extracting roads efficiently and at low cost. However, roads in HRI are easily occluded by buildings, trees, and other landscape elements, which leads to discontinuities in the extracted roads. Although road extraction techniques based on multimodal data fusion improve on single-modal methods by incorporating additional information, most existing fusion methods neither fully exploit the features of the different modalities nor consider prior knowledge of roads. To address these problems, this article proposes a dual encoder-based cross-modal complementary fusion network (DECCFNet). The network takes advantage of the rich feature information in HRI and the robustness of LiDAR data to shadows. By fusing the complementary information from HRI and LiDAR data, DECCFNet improves IoU by at least 2.94% and 2.8% on the two datasets, respectively, compared with methods that use a single data modality. DECCFNet contains two main modules: 1) the cross-modal feature fusion (CMFF) module: in the dual encoder, CMFF fuses the deep features of the two modalities along the channel and spatial dimensions, while a multiscale fusion strategy extracts contextual information; 2) the multi-direction strip convolution (MDSC) module: because roads are narrow and continuous, applying classical square convolution kernels directly to road features may introduce irrelevant pixels into the computation and blur the extraction results. To mitigate this issue, MDSC builds on square convolution and applies strip convolutions to road features along multiple directions, so that the network focuses more on road-specific features. In comparisons with several deep-learning multimodal data fusion networks on the two road datasets, the proposed network achieves the best road extraction results.
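
The motivation for strip convolution can be illustrated with a small sketch. The code below is not the MDSC module from the article: the function names, the four directions, and the fixed averaging weights are assumptions chosen for illustration, whereas a real network would learn its kernel weights. It shows only that a strip kernel aligned with a thin road aggregates road pixels alone, while kernels in other directions mix in background pixels.

```python
# Hypothetical sketch: names, directions, and uniform averaging weights are
# assumptions for illustration, not the MDSC module described in the article.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "anti_diagonal": (1, -1),
}


def strip_conv(image, k, step):
    """Average k pixels along step = (dy, dx), centred on each pixel.

    Pixels outside the image count as zero (zero padding); k is assumed odd.
    """
    h, w = len(image), len(image[0])
    dy, dx = step
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for t in range(-half, half + 1):
                yy, xx = y + t * dy, x + t * dx
                if 0 <= yy < h and 0 <= xx < w:
                    total += image[yy][xx]
            out[y][x] = total / k
    return out


def multi_direction_strip(image, k=3):
    """Return one strip-filtered map per direction."""
    return {name: strip_conv(image, k, step) for name, step in DIRECTIONS.items()}


# A 5x5 toy feature map with a one-pixel-wide horizontal "road" in row 2.
road = [[0.0] * 5 for _ in range(5)]
road[2] = [1.0] * 5
responses = multi_direction_strip(road, k=3)
```

At the centre pixel of this toy map, the horizontal strip covers three road pixels, while the vertical and diagonal strips each cover one road pixel and two background pixels. A 3x3 square average at the same pixel would likewise be dominated by background, which is the blurring effect the abstract describes.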
