Multi-scale Latent Feature-Aware Network for Logical Partition Based 3D Voxel Reconstruction

Caixia Liu, Dehui Kong, Shaofan Wang, Qianxing Li, Jinghua Li, Baocai Yin
DOI: https://doi.org/10.1016/j.neucom.2023.02.041
IF: 6
2023-01-01
Neurocomputing
Abstract: Although prior methods have achieved promising performance in recovering 3D geometry from a single depth image, they tend to produce incomplete and noisy 3D shapes. To address this, we propose the Multi-Scale Latent Feature-Aware Network (MLANet) to recover the full 3D voxel grid from a single depth view of an object. MLANet logically represents a 3D voxel grid as visible voxels, occluded voxels and non-object voxels, and aims to reconstruct the latter two. MLANet first introduces a Multi-Scale Latent Feature-Aware (MLFA) based AutoEncoder (MLFA-AE) and a logical partition module to predict an occluded voxel grid (OccVoxGd) and a non-object voxel grid (NonVoxGd) from the visible voxel grid (VisVoxGd) corresponding to the input. MLANet then introduces an MLFA based Generative Adversarial Network (MLFA-GAN) to refine the OccVoxGd and the NonVoxGd, and combines them with the VisVoxGd to generate a target 3D occupancy grid. MLFA shows a strong ability to learn multi-scale features of an object and can be considered a plug-and-play component for improving existing networks. The logical partition helps suppress NonVoxGd noise and improve OccVoxGd accuracy under adversarial constraints. Experimental studies on both synthetic and real-world data show that MLANet outperforms state-of-the-art methods and, in particular, reconstructs unseen object categories with higher accuracy.
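The logical partition described in the abstract can be illustrated with a small sketch. This is only a toy illustration of the three-way split (visible, occluded, non-object) and its recombination, not the authors' network: the function names `partition` and `combine`, the set-of-coordinates representation, and the recombination rule "(visible or occluded) minus non-object" are all assumptions made for the example.

```python
from itertools import product


def partition(full, visible, size):
    """Split a size^3 grid into visible, occluded and non-object voxel sets.

    `full` is the set of occupied (x, y, z) voxels of the object and
    `visible` is the subset observed in the depth view (both hypothetical inputs).
    """
    all_voxels = set(product(range(size), repeat=3))
    vis = set(visible) & set(full)   # object voxels seen from the depth view
    occ = set(full) - vis            # object voxels hidden from the view
    non = all_voxels - set(full)     # empty space (no object)
    return vis, occ, non


def combine(vis, occ, non):
    """Assumed recombination: occupied = (visible or occluded) minus non-object."""
    return (vis | occ) - non


# Toy 2x2x2 grid: three occupied voxels, two of them visible.
size = 2
full = {(0, 0, 0), (0, 0, 1), (1, 0, 0)}
visible = {(0, 0, 0), (1, 0, 0)}

vis, occ, non = partition(full, visible, size)
recon = combine(vis, occ, non)
```

In this toy setting the three sets are disjoint and cover the whole grid, so recombining them recovers the original occupancy exactly. In MLANet the occluded and non-object grids are predicted and refined by the network rather than computed from a known ground truth.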