Abstract: Smart mobility and autonomous vehicles (AVs) require a very precise perception of the environment, both to guarantee reliable decision-making and to allow results obtained for the road sector to be extended to other areas such as rail. To this end, we introduce a new single-stage, monocular, real-time 3D object detection convolutional neural network (CNN) based on YOLOv5, dedicated to smart mobility applications in both road and rail environments. To perform the 3D parameter regression, we replace YOLOv5’s anchor boxes with our hybrid anchor boxes. As with YOLOv5, our method is available in different model sizes: small, medium, and large. We also propose a new model, the small split-attention model (Small-SA), which is optimized for real-time embedded constraints (lightweight, speed, and accuracy) and takes advantage of the improvement brought by split-attention (SA) convolutions. To validate our CNN model, we introduce a new virtual dataset covering both road and rail environments, built by leveraging the video game Grand Theft Auto V (GTAV). We provide extensive results for our different models on both KITTI and our own GTAV dataset. These results show that our method is the fastest available 3D object detection method, with accuracy close to that of state-of-the-art methods on the KITTI road dataset. We further demonstrate that pre-training on our GTAV virtual dataset improves accuracy on real datasets such as KITTI, allowing our method to exceed the accuracy of state-of-the-art approaches: it reaches 16.16% 3D average precision on hard car detection, with an inference time of 11.1 ms/image on an RTX 3080 GPU.
