Deep Learning-Enhanced Environment Perception for Autonomous Driving: MDNet with CSP-DarkNet53

Xuyao Guo, Feng Jiang, Quanzhen Chen, Yuxuan Wang, Kaiyue Sha, Jing Chen
DOI: https://doi.org/10.1016/j.patcog.2024.111174
IF: 8
2024-01-01
Pattern Recognition
Abstract: Environmental perception is a crucial capability for intelligent vehicles, but running numerous perception algorithms in parallel on the vehicle is complex, and integrating them remains a critical challenge. To address this problem, this paper proposes the Multitask Detection Network (MDNet), a multitask detection algorithm built on Cross Stage Partial Networks with a Darknet53 backbone (CSP-DarkNet53), which has high feature extraction capability. MDNet simultaneously detects vehicles, pedestrians, traffic lights, traffic signs, and bicycles, as well as lane lines. It obtains exceptional results in multitask scenarios through an architecture consisting of a Feature Extraction Module, Target-level Branches, and Pixel-level Branches. The feature extraction module uses an improved CSPPF structure to extract features more efficiently for the three tasks, increasing MDNet's capacity. The target-level branch introduces PFPN, which combines features from the backbone network, and the pixel-level branch uses a primary feature fusion network and an enhanced C2F_Faster method to detect lane lines more precisely. Together, these designs significantly improve MDNet's performance in complex environments. The algorithm was tested on the Berkeley DeepDrive 100K (BDD100K) and Cityscapes datasets, where it identified traffic targets and lane lines in numerous challenging settings. Relative to You Only Look Once for Panoptic Driving Perception (YOLOP, a multitask detection network), it achieved a 9.8% improvement in detection accuracy (mAP), an 8.9% improvement in IoU, and a 22.1% improvement in accuracy across the three tasks. It reached a speed of 46 FPS, which better meets the requirements of practical applications.
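The IoU and accuracy figures quoted in the abstract are pixel-level metrics. As a minimal sketch of how such metrics are commonly computed for a binary lane-line mask, the following assumes masks given as nested lists of 0/1 and uses plain pixel accuracy. The function name `lane_metrics` and the mask format are illustrative assumptions; the paper's exact metric definitions are not given in the abstract and may differ.

```python
def lane_metrics(pred, target):
    """Return (iou, accuracy) for two equally sized binary masks.

    Assumption: masks are nested lists of 0/1, where 1 marks a lane pixel.
    This is a generic definition, not necessarily the one used in the paper.
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1      # lane pixel predicted correctly
            elif p:
                fp += 1      # background predicted as lane
            elif t:
                fn += 1      # lane pixel missed
            else:
                tn += 1      # background predicted correctly
    union = tp + fp + fn
    total = union + tn
    # Convention chosen here: empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    accuracy = (tp + tn) / total if total else 1.0
    return iou, accuracy


# Hypothetical 2x4 masks used only for illustration.
pred = [[0, 1, 0, 0],
        [0, 1, 1, 0]]
target = [[0, 1, 0, 0],
          [0, 0, 1, 0]]
iou, accuracy = lane_metrics(pred, target)
print(f"IoU={iou:.3f}, accuracy={accuracy:.3f}")
```

Because lane pixels are a small fraction of the image, plain pixel accuracy is dominated by background; IoU ignores true negatives and is therefore the more sensitive of the two for thin structures such as lane lines.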