Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task

Xiaoqing Ye, Mao Shu, Hanyu Li, Yifeng Shi, Yingying Li, Guangjie Wang, Xiao Tan, Errui Ding
DOI: https://doi.org/10.48550/arXiv.2203.13608
2022-03-25
Abstract: Concurrent perception datasets for autonomous driving are mainly limited to frontal view with sensors mounted on the vehicle. None of them is designed for the overlooked roadside perception tasks. On the other hand, the data captured from roadside cameras have strengths over frontal-view data, which are believed to facilitate a safer and more intelligent autonomous driving system. To accelerate the progress of roadside perception, we present the first high-diversity challenging Roadside Perception 3D dataset, Rope3D, from a novel view. The dataset consists of 50k images and over 1.5M 3D objects in various scenes, which are captured under different settings including various cameras with ambiguous mounting positions, camera specifications, viewpoints, and different environmental conditions. We conduct strict 2D-3D joint annotation and comprehensive data analysis, as well as set up a new 3D roadside perception benchmark with metrics and evaluation devkit. Furthermore, we tailor the existing frontal-view monocular 3D object detection approaches and propose to leverage the geometry constraint to solve the inherent ambiguities caused by various sensors and viewpoints. Our dataset is available at https://thudair.baai.ac.cn/rope.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is that existing autonomous driving perception datasets are mainly limited to the front view of the vehicle and ignore tasks from the roadside view. Data from the roadside view offers stronger robustness and longer-term event prediction ability, which can make up for the blind spots and limitations of the vehicle view, thereby achieving a safer and more intelligent autonomous driving system. However, at present there is no 3D perception dataset specifically designed for the roadside view, especially for the monocular 3D object detection task. Specifically, the paper aims to:

1. **Fill the gap in the 3D perception dataset from the roadside view**: Most existing autonomous driving datasets are based on the front view of the vehicle and cannot fully utilize the advantages of roadside cameras. The paper proposes the first highly diverse and challenging roadside perception 3D dataset, Rope3D, to promote 3D perception research from the roadside view.
2. **Solve the monocular 3D detection problems from the roadside view**: Due to the diversity of installation positions, angles, environments, etc. of roadside cameras, the monocular 3D detection task becomes more complicated. The paper addresses these inherent ambiguities and challenges by introducing geometric constraints and other methods.
3. **Establish a new evaluation benchmark**: To better evaluate 3D perception performance from the roadside view, the paper designs new evaluation metrics and provides an evaluation toolkit (devkit) so that researchers can evaluate the performance of models more comprehensively.
4. **Promote the safety and intelligence of autonomous driving systems**: By using data from the roadside view, autonomous driving systems can perceive the surrounding environment more comprehensively, reduce blind spots, and improve driving safety and the ability of intelligent traffic management.

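
The text above does not spell out what the geometric constraint is, so the following is only an illustrative sketch of one common idea, not the paper's method: if the ground plane is known in the camera frame, a single pixel can be back-projected to a unique 3D point by intersecting its viewing ray with that plane, which removes the depth ambiguity of a monocular camera. The function name `backproject_to_ground`, the pinhole intrinsics, and the example plane are all assumptions made for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def backproject_to_ground(
    u: float, v: float,
    fx: float, fy: float, cx: float, cy: float,
    plane: Tuple[float, float, float, float],
) -> Optional[Vec3]:
    """Intersect the viewing ray of pixel (u, v) with the plane a*x + b*y + c*z + d = 0.

    All coordinates are in the camera frame (x right, y down, z forward).
    Returns None if the ray is parallel to the plane or meets it behind the camera.
    """
    a, b, c, d = plane
    # Ray direction through the pixel under a pinhole model, scaled so that z = 1.
    rx, ry, rz = (u - cx) / fx, (v - cy) / fy, 1.0
    denom = a * rx + b * ry + c * rz
    if abs(denom) < 1e-9:
        return None  # ray parallel to the plane
    t = -d / denom
    if t <= 0:
        return None  # intersection lies behind the camera
    return (t * rx, t * ry, t * rz)


if __name__ == "__main__":
    # Hypothetical setup: an unpitched camera 6 m above flat ground, so the
    # ground plane in the camera frame is y = 6, i.e. (a, b, c, d) = (0, 1, 0, -6).
    ground = (0.0, 1.0, 0.0, -6.0)
    # A pixel 200 rows below the principal point lands roughly 30 m ahead.
    print(backproject_to_ground(960, 740, 1000, 1000, 960, 540, ground))
```

Because the plane parameters differ for every roadside camera (height, pitch, road slope), a constraint of this kind has to be supplied per camera rather than assumed fixed, which is one reason mounting diversity makes the roadside setting harder than the vehicle-mounted one.
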
In summary, the main goal of this paper is to promote monocular 3D object detection research from the roadside view by constructing and releasing the Rope3D dataset, thereby providing support for safer and more intelligent autonomous driving systems.