Robust Dual-Modal Image Quality Assessment Aware Deep Learning Network for Traffic Targets Detection of Autonomous Vehicles.

Geng Keke, Dong Ge, Huang Wenhan
DOI: https://doi.org/10.1007/s11042-022-11924-1
IF: 2.577
2022-01-01
Multimedia Tools and Applications
Abstract: Multi-spectral image pairs composed of LIDAR and RGB images demonstrate more effective detection performance in complex traffic environments, such as those with low illumination, motion blur, and strong noise. However, there is still a lack of research on how to better fuse the two modalities to improve the robustness and detection accuracy of the perception system of autonomous vehicles when the visible-light image quality is low. In this paper, we propose a dual-modal image quality aware deep neural network (DMIQADNN). We comprehensively compared and analyzed adaptations of fusion architectures at the early, middle, late, and score stages. Considering both detection accuracy and detection speed, we selected the middle-stage fusion architecture. In addition, we developed an image quality assessment network (IQAN) that assigns an image quality score to each RGB image. The fusion weights for the RGB and LIDAR sub-networks are assigned adaptively by the proposed fusion weight assignment function, and, based on these weights, the two sub-networks are merged via a data fusion sub-network. The RGB images in the KITTI dataset were processed by reducing illumination and adding motion blur and Gaussian noise to produce a modified dataset containing 7481 RGB-LIDAR image pairs, on which the DMIQADNN was trained and tested using semi-automatic annotation. Experimental results on the modified KITTI benchmark and on a dataset collected with our own autonomous vehicle validate the robustness and effectiveness of the proposed method. The final FPS and AP values of the DMIQADNN reach 27 and 39.1, which are superior to those of state-of-the-art instance segmentation networks.
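The dataset modification described in the abstract (reducing illumination, then adding motion blur and Gaussian noise to the RGB images) can be illustrated with a minimal NumPy sketch. The abstract does not give the degradation parameters or the blur model, so the function name `degrade_rgb`, the brightness factor, the horizontal box-kernel blur, and the noise standard deviation below are all assumptions chosen for illustration, not the authors' settings.

```python
import numpy as np


def degrade_rgb(image, brightness=0.4, blur_length=9, noise_sigma=10.0, seed=0):
    """Darken, motion-blur, and add Gaussian noise to an HxWx3 uint8 image.

    Hypothetical helper: every default value here is an illustrative
    assumption, not a parameter taken from the paper.
    """
    if blur_length < 1 or blur_length % 2 == 0:
        raise ValueError("blur_length must be a positive odd integer")

    rng = np.random.default_rng(seed)
    img = image.astype(np.float64)

    # 1) Reduce illumination by scaling all pixel intensities.
    img = img * brightness

    # 2) Motion blur, assumed here to be linear motion along the x axis:
    #    each pixel becomes the mean of blur_length horizontal neighbours.
    kernel = np.ones(blur_length) / blur_length
    pad = blur_length // 2
    padded = np.pad(img, ((0, 0), (pad, pad), (0, 0)), mode="edge")
    blurred = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="valid"), 1, padded
    )

    # 3) Additive zero-mean Gaussian noise.
    noisy = blurred + rng.normal(0.0, noise_sigma, size=blurred.shape)

    # Round and clip back to the valid 8-bit range.
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)
```

Applied to each RGB image while leaving the paired LIDAR image untouched, a function of this kind would yield the sort of degraded RGB-LIDAR pairs the abstract describes. In practice the blur direction and noise level would likely be varied per image rather than fixed.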