Abstract: Diffusion models have been applied to 3D LiDAR scene completion due to their strong training stability and high completion quality. However, the slow sampling speed limits the practical application of diffusion-based scene completion models since autonomous vehicles require an efficient perception of surrounding environments. This paper proposes a novel distillation method tailored for 3D LiDAR scene completion models, dubbed **ScoreLiDAR**, which achieves efficient yet high-quality scene completion. ScoreLiDAR enables the distilled model to sample in significantly fewer steps after distillation. To improve completion quality, we also introduce a novel **Structural Loss**, which encourages the distilled model to capture the geometric structure of the 3D LiDAR scene. The loss contains a scene-wise term constraining the holistic structure and a point-wise term constraining the key landmark points and their relative configuration. Extensive experiments demonstrate that ScoreLiDAR significantly accelerates the completion time from 30.55 to 5.37 seconds per frame ($>5\times$) on SemanticKITTI and achieves superior performance compared to state-of-the-art 3D LiDAR scene completion models. Our code is publicly available at https://github.com/happyw1nd/ScoreLiDAR.

What problem does this paper attempt to address?

This paper addresses how to accelerate diffusion-based 3D LiDAR scene completion while preserving completion quality. The authors propose ScoreLiDAR, a distillation method that reduces the number of sampling steps and thereby speeds up scene completion, and they introduce a Structural Loss to preserve the quality of the completed scenes.

### Background and Problem Description

In applications such as autonomous driving, 3D LiDAR sensors provide high-precision environmental perception, but the point clouds they collect are usually sparse, especially in occluded areas. Completing these sparse scans yields a denser and more comprehensive scene representation. Existing diffusion models offer good training stability and generation quality, but their slow sampling limits their practical use. The open problem is therefore how to speed up diffusion sampling without sacrificing generation quality.

### Main Contributions of the Paper

1. **ScoreLiDAR**: A distillation method designed for 3D LiDAR scene completion that achieves efficient, high-quality completion with significantly fewer sampling steps.
2. **Structural Loss**: A loss with a scene-level term and a point-level term that captures the geometric structure of 3D point clouds, so that the student model learns complex geometric features.
3. **Experimental verification**: Extensive experiments show that ScoreLiDAR speeds up sampling by more than 5 times and outperforms existing state-of-the-art models on multiple metrics.

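To make the two terms of the Structural Loss more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes, purely for illustration, that the scene-level term is a symmetric (squared) Chamfer distance and that the point-level term compares pairwise distances among landmark points; the function names, the `landmark_idx` argument, the index alignment between prediction and ground truth, and the `point_weight` parameter are all hypothetical.

```python
import numpy as np


def scene_wise_term(pred, gt):
    """Symmetric squared Chamfer distance between (N, 3) and (M, 3) clouds (assumed form)."""
    # Squared distances between every predicted and every ground-truth point.
    d2 = ((pred[:, None, :] - gt[None, :, :]) ** 2).sum(axis=-1)
    # Average nearest-neighbour distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


def pairwise_distances(points):
    """Euclidean distance matrix of a (K, 3) point set."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def point_wise_term(pred, gt, landmark_idx):
    """Mismatch in the relative configuration of landmark points (assumed form).

    Assumes pred[i] corresponds to gt[i]; this index alignment is hypothetical.
    """
    d_pred = pairwise_distances(pred[landmark_idx])
    d_gt = pairwise_distances(gt[landmark_idx])
    return np.abs(d_pred - d_gt).mean()


def structural_loss(pred, gt, landmark_idx, point_weight=1.0):
    """Scene-level term plus weighted point-level term."""
    return scene_wise_term(pred, gt) + point_weight * point_wise_term(pred, gt, landmark_idx)


rng = np.random.default_rng(0)
gt = rng.normal(size=(64, 3))              # stand-in for a ground-truth scene
landmarks = np.array([0, 5, 10, 20])       # hypothetical landmark indices
shifted = gt + np.array([1.0, 0.0, 0.0])   # rigidly translated "prediction"

print(structural_loss(gt, gt, landmarks))
print(scene_wise_term(shifted, gt), point_wise_term(shifted, gt, landmarks))
```

In this sketch a rigid translation leaves the point-level term at essentially zero, because pairwise distances do not change, while the scene-level term becomes positive. This illustrates why the two terms are complementary: one constrains where the scene is, the other constrains the relative layout of the landmark points.
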
### Key Technologies of the Solution

- **Variational Score Distillation (VSD)**: The pre-trained diffusion model (the teacher) is used to compute a distribution matching loss that trains the student model.
- **Structural Loss**: A scene-level term and a point-level term constrain the overall structure and the key landmark points with their relative configuration, respectively.
- **Optimization process**: The student model and an auxiliary diffusion model are optimized alternately so that the student learns effectively from the teacher.

### Experimental Results

ScoreLiDAR performs well on both the SemanticKITTI and KITTI-360 datasets. Compared with the state-of-the-art LiDiff model, it shortens the completion time on SemanticKITTI from 30.55 to 5.37 seconds per frame and significantly improves evaluation metrics such as Chamfer Distance (CD) and Jensen-Shannon Divergence (JSD). The ablation study further supports the effectiveness of the Structural Loss for generation quality.

In conclusion, the paper addresses the slow sampling of diffusion-based 3D LiDAR scene completion with ScoreLiDAR, providing a faster and more efficient solution for applications such as autonomous driving.
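
The alternating optimization listed under the key technologies (student model versus auxiliary diffusion model, both guided by a fixed teacher) can be illustrated with a toy example. The following sketch is a one-dimensional Gaussian stand-in, not the paper's training code: the teacher score is written in closed form instead of coming from a pre-trained network, a single noise level is used, and every name (`theta`, `phi`, `teacher_score`, `auxiliary_score`, the learning rates) is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

MU_TEACHER = 3.0           # mean of the teacher's target distribution N(MU_TEACHER, 1)
SIGMA = 1.0                # single fixed noise level for the diffusion perturbation
VAR_T = 1.0 + SIGMA ** 2   # variance of a unit-variance sample after adding noise


def teacher_score(x_t):
    # Exact score of the noised teacher distribution N(MU_TEACHER, VAR_T).
    return -(x_t - MU_TEACHER) / VAR_T


def auxiliary_score(x_t, phi):
    # Auxiliary model: score of N(phi, VAR_T); phi is its only parameter.
    return -(x_t - phi) / VAR_T


theta = 0.0   # student parameter: the one-step generator is G(z) = z + theta
phi = 0.0     # auxiliary parameter, trained to track the student distribution
lr_student, lr_aux, batch = 0.05, 0.5, 1024

for step in range(600):
    z = rng.normal(size=batch)
    eps = rng.normal(size=batch)
    x0 = z + theta           # student samples
    x_t = x0 + SIGMA * eps   # noised student samples

    # (1) Student update: descend E[(s_aux - s_teacher) * dx_t/dtheta], where dx_t/dtheta = 1.
    grad_theta = np.mean(auxiliary_score(x_t, phi) - teacher_score(x_t))
    theta -= lr_student * grad_theta

    # (2) Auxiliary update: denoising score matching on the student samples.
    target = -eps / SIGMA                        # score of the Gaussian noising kernel
    residual = auxiliary_score(x_t, phi) - target
    grad_phi = np.mean(2.0 * residual / VAR_T)   # gradient of residual**2 w.r.t. phi
    phi -= lr_aux * grad_phi

print(f"student mean {theta:.3f}, auxiliary mean {phi:.3f}, teacher mean {MU_TEACHER}")
```

In this toy setting the auxiliary parameter chases the student's current mean, and the difference between the auxiliary and teacher scores pushes the student toward the teacher's mean. The auxiliary model is given a larger learning rate here so that it tracks the student closely; that is a choice made for this sketch, not a detail taken from the paper.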

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
