A Comprehensive Review of 3D Object Detection in Autonomous Driving: Technological Advances and Future Directions

Yu Wang, Shaohua Wang, Yicheng Li, Mingchun Liu
2024-08-28
Abstract: In recent years, 3D object perception has become a crucial component in the development of autonomous driving systems, providing essential environmental awareness. However, as perception tasks in autonomous driving evolve, their variants have increased, leading to diverse insights from industry and academia. Currently, there is a lack of comprehensive surveys that collect and summarize these perception tasks and their developments from a broader perspective. This review extensively summarizes traditional 3D object detection methods, focusing on camera-based, LiDAR-based, and fusion detection techniques. We provide a comprehensive analysis of the strengths and limitations of each approach, highlighting advancements in accuracy and robustness. Furthermore, we discuss future directions, including methods to improve accuracy such as temporal perception, occupancy grids, and end-to-end learning frameworks. We also explore cooperative perception methods that extend the perception range through collaborative communication. By providing a holistic view of the current state and future developments in 3D object perception, we aim to offer a more comprehensive understanding of perception tasks for autonomous driving. Additionally, we have established an active repository to provide continuous updates on the latest advancements in this field, accessible at: https://github.com/Fishsoup0/Autonomous-Driving-Perception
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the lack of a comprehensive review of 3D object detection for autonomous driving and of its future directions. Specifically, the paper aims to:

1. **Provide a comprehensive analysis**: Analyze 3D object detection methods based on cameras, LiDAR (Light Detection and Ranging), and sensor fusion, summarize the strengths and limitations of each, and highlight progress in accuracy and robustness.
2. **Fill the existing gap**: Few existing surveys collect and summarize these perception tasks and their development from a broader perspective. The paper fills this gap by reviewing traditional 3D object detection methods in full.
3. **Explore future directions**: Discuss ways to improve accuracy, including temporal perception, occupancy grids, and end-to-end learning frameworks, and explore cooperative perception methods that extend the perception range through collaborative communication.
4. **Provide a panoramic view**: Beyond summarizing perception methods, compile the datasets and evaluation metrics used by different methods to support further research.

### Specific problem description

In autonomous driving, 3D object detection is one of the key technologies for accurate environmental perception. It uses sensors (cameras, LiDAR, or multi-sensor fusion) to capture environmental data and applies algorithms to identify the objects in it. Specific tasks include:

- **3D object detection**: Identify and locate objects (such as vehicles, pedestrians, and obstacles) in sensor data, and determine their positions, sizes, orientations, and categories. Each detection is expressed as a bounding box encoding position, size, heading, and category:
  \[ B = [x, y, z, l, w, h, \theta, \text{class}] \]
- **Temporal perception**: Process continuous sensor data streams to achieve real-time object detection and tracking.
- **3D occupancy grid**: Estimate the occupancy state and semantic label of each voxel in 3D space, improving the accuracy of path planning and collision detection.
- **End-to-end autonomous driving**: Take sensor data directly as input and produce driving decisions as output, which simplifies the system architecture but requires the whole pipeline to be optimized jointly.
- **Cooperative perception**: Share information between vehicles (V2V) or between vehicles and infrastructure (V2I) to extend the perception ability of a single vehicle and improve accuracy and robustness in complex environments.

### Summary

The main contribution of this paper is a comprehensive review and classification of 3D object perception technologies for autonomous driving, covering recent camera-based, LiDAR-based, and fusion detection methods. It also proposes future research directions and compiles dataset and evaluation-metric resources for researchers.
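To make the box parameterization B = [x, y, z, l, w, h, θ, class] from the task description more concrete, the following is a minimal sketch of how such a box might be stored and expanded into its eight corners. The conventions are assumptions, not taken from the paper: (x, y, z) is treated as the box center, l runs along the heading direction, w is lateral, h is vertical, and θ is a yaw rotation about the vertical z-axis. The class name `Box3D` and its methods are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical container for B = [x, y, z, l, w, h, theta, class]."""
    x: float      # box center (assumed convention)
    y: float
    z: float
    l: float      # length, along the heading direction
    w: float      # width, lateral
    h: float      # height, vertical
    theta: float  # yaw about the z-axis, in radians (assumed convention)
    cls: str      # category label

    def corners(self):
        """Return the 8 box corners as (x, y, z) tuples."""
        c, s = math.cos(self.theta), math.sin(self.theta)
        pts = []
        for dx in (self.l / 2, -self.l / 2):
            for dy in (self.w / 2, -self.w / 2):
                for dz in (self.h / 2, -self.h / 2):
                    # Rotate the local offset about z, then translate to the center.
                    pts.append((self.x + c * dx - s * dy,
                                self.y + s * dx + c * dy,
                                self.z + dz))
        return pts

    def volume(self):
        """Box volume, independent of heading."""
        return self.l * self.w * self.h


# Example values are illustrative only.
box = Box3D(10.0, 2.0, 0.8, 4.0, 2.0, 1.6, math.pi / 2, "vehicle")
corners = box.corners()
```

Datasets differ in their axis and angle conventions (for example, camera-frame versus LiDAR-frame yaw), so the rotation step would need adjusting to match whichever convention a given benchmark uses.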