Abstract: Current perception datasets for autonomous driving are mainly limited to the frontal view, with sensors mounted on the vehicle. None of them is designed for the overlooked roadside perception tasks. Yet data captured from roadside cameras have strengths over frontal-view data, which are believed to facilitate a safer and more intelligent autonomous driving system. To accelerate progress in roadside perception, we present the first high-diversity, challenging Roadside Perception 3D dataset, Rope3D, captured from a novel view. The dataset consists of 50k images and over 1.5M 3D objects in various scenes, captured under different settings, including various cameras with ambiguous mounting positions, camera specifications, viewpoints, and environmental conditions. We conduct strict 2D-3D joint annotation and comprehensive data analysis, and set up a new 3D roadside perception benchmark with metrics and an evaluation devkit. Furthermore, we tailor existing frontal-view monocular 3D object detection approaches and propose to leverage the geometry constraint to resolve the inherent ambiguities caused by various sensors and viewpoints. Our dataset is available at https://thudair.baai.ac.cn/rope.

What problem does this paper attempt to address?

The paper addresses the fact that existing autonomous driving perception datasets are mainly limited to the vehicle's frontal view and ignore tasks from the roadside view. Roadside-view data offers stronger robustness and longer-term event prediction, and can compensate for the blind spots and limitations of the vehicle view, enabling a safer and more intelligent autonomous driving system. However, there is currently no 3D perception dataset designed specifically for the roadside view, especially for the monocular 3D object detection task. Specifically, the paper aims to:

1. **Fill the gap in 3D perception datasets from the roadside view**: Most existing autonomous driving datasets are based on the vehicle's frontal view and cannot fully exploit the advantages of roadside cameras. The paper proposes the first highly diverse and challenging roadside perception 3D dataset, Rope3D, to promote 3D perception research from the roadside view.
2. **Solve monocular 3D detection problems from the roadside view**: Because roadside cameras vary in installation position, angle, and environment, the monocular 3D detection task becomes more complicated. The paper addresses these inherent ambiguities by introducing geometric constraints, among other methods.
3. **Establish a new evaluation benchmark**: To better evaluate 3D perception performance from the roadside view, the paper designs new evaluation metrics and provides an evaluation toolkit (devkit) so that researchers can evaluate models more comprehensively.
4. **Promote the safety and intelligence of autonomous driving systems**: With roadside-view data, autonomous driving systems can perceive the surrounding environment more comprehensively, reduce blind spots, and improve both driving safety and intelligent traffic management.

In summary, the main goal of this paper is to promote monocular 3D object detection research from the roadside view by constructing and releasing the Rope3D dataset, thereby supporting safer and more intelligent autonomous driving systems.
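
This summary does not spell out the geometric constraint the paper introduces. As a purely illustrative sketch, and not the authors' method, one common way to use geometry with a fixed roadside camera is to intersect a pixel's viewing ray with a known ground plane. The function name `pixel_to_ground`, the pinhole model without lens distortion, and the plane form `a*X + b*Y + c*Z + d = 0` in camera coordinates are all assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def pixel_to_ground(
    u: float,
    v: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    plane: Sequence[float],
) -> Optional[Vec3]:
    """Intersect the viewing ray of pixel (u, v) with a ground plane.

    Hypothetical helper: assumes an undistorted pinhole camera with focal
    lengths (fx, fy) and principal point (cx, cy), and a ground plane
    a*X + b*Y + c*Z + d = 0 given in camera coordinates.
    Returns the 3D point in camera coordinates, or None when the ray is
    parallel to the plane or the intersection lies behind the camera.
    """
    # Ray direction through the pixel, normalized so that Z = 1.
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    dz = 1.0

    a, b, c, d = plane
    denom = a * dx + b * dy + c * dz
    if abs(denom) < 1e-9:
        return None  # ray is (nearly) parallel to the ground plane

    # Points on the ray are t * (dx, dy, dz); solve the plane equation for t.
    t = -d / denom
    if t <= 0:
        return None  # intersection is behind the camera

    return (t * dx, t * dy, t * dz)


# Example: camera Y axis points down, ground is 6 m below the camera (Y = 6).
ground = (0.0, 1.0, 0.0, -6.0)
point = pixel_to_ground(960, 1040, 1000.0, 1000.0, 960.0, 540.0, ground)
print(point)
```

Under these assumptions, a pixel below the horizon maps to a single ground point, which gives a depth cue that does not depend on object appearance. Pixels at or above the horizon return `None` because their rays never reach the plane in front of the camera.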

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task

RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

RopeBEV: A Multi-Camera Roadside Perception Network in Bird's-Eye-View

RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks

OpenMPD: An Open Multimodal Perception Dataset for Autonomous Driving

Road and Railway Smart Mobility: A High-Definition Ground Truth Hybrid Dataset

RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving

TUMTraf V2X Cooperative Perception Dataset

ROAD: The ROad event Awareness Dataset for Autonomous Driving

Towards Robust Robot 3D Perception in Urban Environments: The UT Campus Object Dataset

Towards Scenario Generalization for Vision-based Roadside 3D Object Detection

R4D: Utilizing Reference Objects for Long-Range Distance Estimation

SGV3D: Towards Scenario Generalization for Vision-based Roadside 3D Object Detection

Roadside Monocular 3D Detection via 2D Detection Prompting

Roadside HD Map Object Reconstruction Using Monocular Camera

HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective

3D Object Detection for Autonomous Driving: A Survey

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

DOLPHINS: Dataset for Collaborative Perception enabled Harmonious and Interconnected Self-driving