Deep Learning Based 3D Segmentation: A Survey

Yong He, Hongshan Yu, Xiaoyan Liu, Zhengeng Yang, Wei Sun, Saeed Anwar, Ajmal Mian
2024-09-17
Abstract: 3D segmentation is a fundamental and challenging problem in computer vision with applications in autonomous driving and robotics. It has received significant attention from the computer vision, graphics and machine learning communities. Conventional methods for 3D segmentation, based on hand-crafted features and machine learning classifiers, lack generalization ability. Driven by their success in 2D computer vision, deep learning techniques have recently become the tool of choice for 3D segmentation tasks. This has led to an influx of many methods in the literature that have been evaluated on different benchmark datasets. Whereas survey papers on RGB-D and point cloud segmentation exist, there is a lack of a recent in-depth survey that covers all 3D data modalities and application domains. This paper fills the gap and comprehensively surveys the recent progress in deep learning-based 3D segmentation techniques. We cover over 220 works from the last six years, analyze their strengths and limitations, and discuss their competitive results on benchmark datasets. The survey provides a summary of the most commonly used pipelines and finally highlights promising research directions for the future.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the lack of a comprehensive, up-to-date survey of deep learning-based 3D segmentation. 3D segmentation is a fundamental and challenging problem in computer vision, with applications in fields such as autonomous driving, robotics, industrial control, and augmented reality. Traditional methods based on hand-crafted features and machine learning classifiers generalize poorly. Driven by their success in 2D computer vision, deep learning techniques have become the preferred tool for 3D segmentation tasks. However, existing surveys focus mainly on RGB-D and point cloud segmentation and do not cover all 3D data modalities and application areas. This paper fills that gap by reviewing recent progress in deep learning-based 3D segmentation techniques.
The main contributions of the paper are:
1. It is presented as the first comprehensive survey of deep learning-based 3D segmentation methods in computer vision, covering the most common 3D data representations: RGB-D, projected images, voxels, point clouds, meshes, and 3D videos.
2. It analyzes in depth the relative advantages and disadvantages of segmentation methods for the different 3D data types. Unlike existing surveys, it focuses on deep learning methods designed specifically for 3D segmentation and discusses typical segmentation pipelines.
3. It compares existing methods on multiple public 3D benchmark datasets, draws conclusions from the results, and points out promising directions for future research.
Through these contributions, the paper aims to give researchers and practitioners a guide to understanding and applying deep learning-based 3D segmentation techniques.