A Multi-Task Network Based on Dual-Neck Structure for Autonomous Driving Perception

Guopeng Tan, Chao Wang, Zhihua Li, Yuanbiao Zhang, Ruikai Li
DOI: https://doi.org/10.3390/s24051547
IF: 3.9
2024-02-29
Sensors
Abstract: A vision-based autonomous driving perception system necessitates the accomplishment of a suite of tasks, including vehicle detection, drivable area segmentation, and lane line segmentation. In light of the limited computational resources available, multi-task learning has emerged as the preeminent methodology for crafting such systems. In this article, we introduce a highly efficient end-to-end multi-task learning model that showcases promising performance on all fronts. Our approach entails the development of a reliable feature extraction network by introducing a feature extraction module called C2SPD. Moreover, to account for the disparities among various tasks, we propose a dual-neck architecture. Finally, we present an optimized design for the decoders of each task. Our model evinces strong performance on the demanding BDD100K dataset, attaining remarkable accuracy (Acc) in vehicle detection and superior precision in drivable area segmentation (mIoU). In addition, this is the first work that can process these three visual perception tasks simultaneously in real time on an embedded device Atlas 200I A2 and maintain excellent accuracy.
Engineering, Electrical & Electronic; Chemistry, Analytical; Instruments & Instrumentation
What problem does this paper attempt to address?
The paper asks how to design an efficient and accurate multi-task learning network that handles the three key perception tasks of an autonomous driving system (vehicle detection, drivable area segmentation, and lane line segmentation) in real time under limited computational resources. It proposes YOLOP-DN, a multi-task network built on a dual-neck structure, which aims to improve on the detection and segmentation accuracy of existing methods while keeping computational cost low and inference fast. With the efficient C2SPD feature extraction module and an optimized decoder design, the network runs in real time on an embedded device and improves on the baseline model in all three tasks on the BDD100K dataset.

### Main contributions:

1. **Proposed a new feature extraction backbone network**: The C2SPD module improves feature extraction ability while keeping the number of parameters and the inference speed under control.
2. **Designed a dual-neck structure**: Detection and segmentation each get a neck that meets their own feature requirements, avoiding the insufficient information complementarity that a single-neck structure causes.
3. **Developed an end-to-end multi-task learning network, YOLOP-DN**: The network performs well on the BDD100K dataset and has been deployed on the Atlas 200I A2 embedded device, where it achieves real-time inference.

### Experimental results:

- **Vehicle detection**: YOLOP-DN reaches 78.1% mAP50, 1.6 percentage points above the baseline model, and 90.5% recall, 1.3 percentage points above the baseline model.
- **Drivable area segmentation**: YOLOP-DN reaches 92.0% mIoU, 0.5 percentage points above the baseline model.
- **Lane line segmentation**: YOLOP-DN reaches 73.8% accuracy and 27.3% IoU, which are 3.3 and 1.1 percentage points above the baseline model respectively.

### Model parameters and inference speed:

- YOLOP-DN has 10.9M parameters and runs at 91 fps. The baseline model YOLOP has 7.9M parameters and runs at 125 fps, and the state-of-the-art YOLOPv2 has 38.9M parameters and runs at 168 fps. YOLOP-DN therefore balances parameter count against inference speed: it is far smaller than YOLOPv2, and it gains accuracy over YOLOP at the cost of some speed.

### Conclusion:

With these improvements, YOLOP-DN combines strong accuracy, modest computational resource consumption, and real-time inference, which supports its practical use in autonomous driving perception systems.
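
The IoU and mIoU figures quoted in the experimental results follow the usual segmentation definitions: IoU for a class is the number of pixels that prediction and ground truth both assign to that class, divided by the number that either assigns to it, and mIoU is the mean of the per-class values. The sketch below illustrates those definitions on a toy label grid. It is not the paper's evaluation code; the function names, the tiny masks, and the convention of scoring an absent class as 1.0 are assumptions made for illustration, and the summary does not say which classes the paper averages over.

```python
from typing import Sequence

# A mask is a 2D grid of integer class labels (hypothetical toy format).
Mask = Sequence[Sequence[int]]


def class_iou(pred: Mask, target: Mask, cls: int) -> float:
    """Intersection over union for a single class label."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred = p == cls
            in_target = t == cls
            inter += int(in_pred and in_target)
            union += int(in_pred or in_target)
    # Assumed convention: a class absent from both masks counts as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pred: Mask, target: Mask, classes: Sequence[int]) -> float:
    """Unweighted mean of the per-class IoU values."""
    return sum(class_iou(pred, target, c) for c in classes) / len(classes)


# Toy example: 0 = background, 1 = drivable area (labels are illustrative).
pred = [[0, 0, 1, 1],
        [0, 1, 1, 1]]
target = [[0, 0, 0, 1],
          [0, 1, 1, 1]]

drivable_iou = class_iou(pred, target, 1)    # 4 shared pixels / 5 in the union
background_iou = class_iou(pred, target, 0)  # 3 shared pixels / 4 in the union
miou = mean_iou(pred, target, [0, 1])
```

In this toy case one extra pixel is predicted as drivable, which lowers the IoU of both classes. This also suggests why thin structures such as lane lines tend to score much lower IoU than large regions: a small number of misplaced pixels is a large share of the union.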