Abstract: With the rapid development of artificial intelligence and computer vision, autonomous driving has become an active research area. The driving scenarios of autonomous vehicles can be divided into structured and unstructured scenes. Compared with structured scenes, unstructured road scenes lack the constraints of lane lines and traffic rules, and traffic participants are less safety-aware. Environment perception for autonomous vehicles in unstructured road scenes therefore faces new and stricter requirements. Existing research rarely integrates object detection and road segmentation so that an autonomous vehicle can perform both tasks simultaneously in unstructured road scenes. To address this, a multitask model for object detection and road segmentation in unstructured road scenes is proposed. By sharing and fusing parts of the object detection model and the road segmentation model, the multitask model completes multi-object detection and road segmentation from a single input image. First, MobileNetV2 replaces the backbone network of YOLOv5, and multi-scale feature fusion is used to exchange information between feature layers at different scales. Next, a road segmentation model is designed based on the DeepLabV3+ algorithm; it uses MobileNetV2 as the backbone network and is optimized with a binary focal loss function. The object detection and road segmentation algorithms are then fused on the shared MobileNetV2 network to obtain the multitask model, which is trained on both a public dataset and the self-built NJFU dataset. The training results show that the multitask model increases execution speed by approximately 10 FPS while maintaining the accuracy of object detection and road segmentation. Finally, the multitask model is validated on an actual vehicle.
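
The abstract names a binary focal loss for the road segmentation branch but does not give its form or hyperparameters. The following is a minimal sketch of the standard binary focal loss, not the authors' implementation. The function names are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are the values commonly used with focal loss, assumed here because the abstract does not state them.

```python
import math


def binary_focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one pixel (illustrative sketch, not the paper's code).

    prob   -- predicted probability that the pixel is road, in [0, 1]
    target -- ground-truth label, 1 for road and 0 for background
    gamma  -- focusing parameter (assumed value); 0 disables the focusing term
    alpha  -- weight of the positive class (assumed value)
    """
    # Clamp to avoid log(0) for saturated predictions.
    prob = min(max(prob, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # The (1 - p_t) ** gamma factor down-weights well-classified pixels.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Average the per-pixel focal loss over a flattened prediction map."""
    losses = [binary_focal_loss(p, t, gamma, alpha) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # A confidently correct road pixel contributes far less than a missed one.
    easy = binary_focal_loss(0.9, 1)
    hard = binary_focal_loss(0.1, 1)
    print(f"easy pixel: {easy:.6f}, hard pixel: {hard:.6f}")
```

With `gamma=0.0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which is one way to sanity-check an implementation. Larger `gamma` values shift the training signal toward hard pixels, which is the usual motivation for focal loss when road and background pixels are imbalanced.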

Multi-Task Deep Learning Model for Autonomous Driving: Object Detection, Semantic Segmentation, and Depth Estimation

A Depth Estimation Framework Based on Unsupervised Learning and Cross-Modal Translation

Semantic Segmentation and Depth Estimation of Urban Road Scene Images Using Multi-Task Networks

A Joint Object Detection and Semantic Segmentation Model with Cross-Attention and Inner-Attention Mechanisms

Multi-Modal Sensor Fusion-Based Deep Neural Network for End-to-End Autonomous Driving With Scene Understanding

Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator

LiDAR-BEVMTN: Real-Time LiDAR Bird's-Eye View Multi-Task Perception Network for Autonomous Driving

3D car-detection based on a Mobile Deep Sensor Fusion Model and real-scene applications

A Multi-Task Network Based on Dual-Neck Structure for Autonomous Driving Perception

Joint Semantic Understanding with a Multilevel Branch for Driving Perception

End-to-End Autonomous Driving With Semantic Depth Cloud Mapping and Multi-Agent

Towards Compact Autonomous Driving Perception With Balanced Learning and Multi-Sensor Fusion

Research on multitask model of object detection and road segmentation in unstructured road scenes

Multi-Task Learning in Autonomous Driving Scenarios Via Adaptive Feature Refinement Networks

Research on Road Scene Understanding of Autonomous Vehicles Based on Multi-Task Learning

Enhanced encoder–decoder architecture for visual perception multitasking of autonomous driving

Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference

MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving

A Camera-Based End-to-End Autonomous Driving Framework Combined with Meta-Based Multi-task Optimization