Abstract: Monocular 3D vehicle localization is an important task in Intelligent Transportation Systems (ITS) and Cooperative Vehicle Infrastructure Systems (CVIS), and it is usually achieved through monocular 3D vehicle detection. However, monocular cameras cannot measure depth directly because of their inherent imaging mechanism, which makes monocular 3D tasks more challenging. Most current monocular 3D vehicle detection methods rely on 2D detectors and additional geometric modules, which reduces efficiency. In this paper, we propose CenterLoc3D, a 3D vehicle localization network for roadside monocular cameras that directly predicts the centroid and eight vertices in image space, together with the dimensions of the 3D bounding box, without a 2D detector. To improve localization precision, we embed a weighted-fusion module and a loss with spatial constraints in CenterLoc3D. First, the transformation matrix between 2D image space and 3D world space is solved by camera calibration. Second, CenterLoc3D predicts the vehicle type, centroid, eight vertices, and dimensions of each 3D vehicle bounding box. Finally, the centroid in 3D world space is obtained by combining the camera calibration with the CenterLoc3D outputs, which yields the 3D vehicle localization. To the best of our knowledge, this is the first application of 3D vehicle localization for roadside monocular cameras. Hence, we also propose a benchmark for this application that includes a dataset (SVLD-3D), an annotation tool (LabelImg-3D), and evaluation metrics. Experimental validation shows that the proposed method achieves high accuracy and real-time performance. (Abridged for length; please see the article for more details.)
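
The final step of the pipeline, mapping a predicted image-space centroid back to 3D world space with the calibrated transformation, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a standard 3x4 pinhole projection matrix `P` from calibration and assumes the centroid's world height `z` is known (for example, half the predicted vehicle height above a flat road at Z = 0). The function names `image_to_world` and `world_to_image` are hypothetical.

```python
import numpy as np


def world_to_image(P, xyz):
    """Project a 3D world point to pixel coordinates with a 3x4 matrix P."""
    h = np.asarray(P, dtype=float) @ np.append(np.asarray(xyz, dtype=float), 1.0)
    return h[:2] / h[2]


def image_to_world(P, uv, z):
    """Recover (X, Y, z) for pixel uv, assuming the world height z is known.

    From s * [u, v, 1]^T = P @ [X, Y, z, 1]^T, the unknowns (X, Y, s)
    satisfy a 3x3 linear system:
        P[:, 0] * X + P[:, 1] * Y - [u, v, 1] * s = -(P[:, 2] * z + P[:, 3])
    """
    P = np.asarray(P, dtype=float)
    u, v = uv
    A = np.column_stack((P[:, 0], P[:, 1], -np.array([u, v, 1.0])))
    b = -(P[:, 2] * z + P[:, 3])
    X, Y, _scale = np.linalg.solve(A, b)
    return np.array([X, Y, z])


if __name__ == "__main__":
    # Hypothetical calibration: camera 8 m above the road, looking along +Y.
    K = np.array([[1000.0, 0.0, 960.0],
                  [0.0, 1000.0, 540.0],
                  [0.0, 0.0, 1.0]])
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0]])
    C = np.array([0.0, 0.0, 8.0])          # camera centre in world space
    P = K @ np.column_stack((R, -R @ C))   # P = K [R | t], with t = -R C

    centroid_world = np.array([3.0, 20.0, 0.75])
    uv = world_to_image(P, centroid_world)
    print(uv, image_to_world(P, uv, z=0.75))
```

The known-height assumption is what removes the depth ambiguity of a single view. Without it, a pixel only constrains the centroid to a ray, and the linear system above becomes singular when that ray runs parallel to the plane Z = z.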

Vehicle 3D Localization in Road Scenes via a Monocular Moving Camera

Monocular Visual Object 3D Localization in Road Scenes

CenterLoc3D: Monocular 3D Vehicle Localization Network for Roadside Surveillance Cameras

An Efficient Vehicle Localization Method by Using Monocular Vision

A Multi-view 3D Vehicle Detection Method Based On Novel 3D Proposal Generation Method

Monocular Vehicle Self-localization Method Based on Compact Semantic Map

Joint Monocular 3D Vehicle Detection and Tracking

The Earth ain't Flat: Monocular Reconstruction of Vehicles on Steep and Graded Roads from a Moving Camera

Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image

3D Vehicle Detection Using Cheap LiDAR and Camera Sensors

TM3Loc: Tightly-Coupled Monocular Map Matching for High Precision Vehicle Localization

Monocular 3-D Vehicle Detection Using a Cascade Network for Autonomous Driving

Off-road Localization Using Monocular Camera and Nodding LiDAR

Real-time Monocular 3D People Localization and Tracking on Embedded System

Image Guidance Based 3D Vehicle Detection in Traffic Scene

Multi-Stage CNN-Based Monocular 3D Vehicle Localization and Orientation Estimation

Joint 3-D Shape Estimation and Landmark Localization from Monocular Cameras of Intelligent Vehicles

Multiple-Kernel Based Vehicle Tracking Using 3D Deformable Model and Camera Self-Calibration

Ground-aware Monocular 3D Object Detection for Autonomous Driving

Accurate Monocular Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving

Monocular 3D object detection via estimation of paired keypoints for autonomous driving