PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

Anh-Quan Cao, Angela Dai, Raoul de Charette
2024-05-25
Abstract: We propose the task of Panoptic Scene Completion (PSC) which extends the recently popular Semantic Scene Completion (SSC) task with instance-level information to produce a richer understanding of the 3D scene. Our PSC proposal utilizes a hybrid mask-based technique on the non-empty voxels from sparse multi-scale completions. Whereas the SSC literature overlooks uncertainty which is critical for robotics applications, we instead propose an efficient ensembling to estimate both voxel-wise and instance-wise uncertainties along PSC. This is achieved by building on a multi-input multi-output (MIMO) strategy, while improving performance and yielding better uncertainty for little additional compute. Additionally, we introduce a technique to aggregate permutation-invariant mask predictions. Our experiments demonstrate that our method surpasses all baselines in both Panoptic Scene Completion and uncertainty estimation on three large-scale autonomous driving datasets. Our code and data are available at https://astra-vision.github.io/PaSCo.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the following problem: how to predict the panoptic scene completion (PSC) of a scene from incomplete 3D point cloud inputs while also providing voxel-level and instance-level uncertainty estimates. Existing semantic scene completion (SSC) methods have made significant progress in predicting the complete geometry and semantics of a scene, but they ignore instance-level information and uncertainty prediction. This limits their usefulness in applications that must identify and track individual objects, and in real-world deployments with high safety requirements.

To address these issues, the paper proposes a new task, panoptic scene completion (PSC), whose goal is to jointly predict the geometry, semantics, and instance information of a scene from sparse observations. The authors introduce a method, PaSCo, whose architecture combines a multi-scale generative sparse network with a transformer decoder and predicts instances through a mask-based strategy. A multi-input multi-output (MIMO) strategy improves both PSC performance and uncertainty estimation at low additional computational cost. Because mask predictions form an unordered set, PaSCo also introduces a permutation-invariant aggregation technique for combining the mask predictions of multiple outputs. In the reported experiments, PaSCo outperforms all baselines on three large-scale autonomous driving datasets in both PSC and uncertainty estimation.
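The two ensembling ideas described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of predictive entropy as the voxel uncertainty measure, and the greedy IoU matching are all assumptions chosen for brevity. It shows (1) averaging the class probabilities of several subnetworks per voxel to obtain a label, a confidence, and an uncertainty, and (2) aligning two unordered sets of instance masks before they are combined.

```python
import math

def ensemble_voxel_uncertainty(probs_per_subnet):
    """probs_per_subnet[m][v] holds the class probabilities that
    (hypothetical) subnetwork m predicts for voxel v.
    Returns one (label, confidence, entropy) tuple per voxel."""
    n_subnets = len(probs_per_subnet)
    results = []
    for v in range(len(probs_per_subnet[0])):
        n_classes = len(probs_per_subnet[0][v])
        # Average the subnetwork predictions for this voxel.
        mean = [sum(probs_per_subnet[m][v][c] for m in range(n_subnets)) / n_subnets
                for c in range(n_classes)]
        label = max(range(n_classes), key=lambda c: mean[c])
        # Predictive entropy of the averaged distribution (assumed measure).
        entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
        results.append((label, mean[label], entropy))
    return results

def mask_iou(a, b):
    """IoU of two instance masks given as sets of voxel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def match_masks(reference, other):
    """Greedy one-to-one matching of two unordered mask sets by IoU.
    Returns {index in reference: index in other} for overlapping pairs."""
    pairs = sorted(
        ((mask_iou(r, o), i, j)
         for i, r in enumerate(reference)
         for j, o in enumerate(other)),
        reverse=True,
    )
    used_ref, used_other, mapping = set(), set(), {}
    for iou, i, j in pairs:
        if iou > 0.0 and i not in used_ref and j not in used_other:
            mapping[i] = j
            used_ref.add(i)
            used_other.add(j)
    return mapping

# Two subnetworks, two voxels, two classes.
probs = [
    [[0.9, 0.1], [0.5, 0.5]],
    [[0.7, 0.3], [0.1, 0.9]],
]
voxel_stats = ensemble_voxel_uncertainty(probs)

# The same two instances, listed in a different order by each subnetwork.
masks_a = [{0, 1, 2}, {5, 6}]
masks_b = [{5, 6, 7}, {0, 1}]
mapping = match_masks(masks_a, masks_b)
```

Once masks are matched, their per-voxel probabilities can be averaged in the same way as the class probabilities, which gives an instance-level counterpart to the voxel-level estimate. An optimal assignment solver could replace the greedy loop where matching quality matters more than simplicity.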