SliceNet: A Proficient Model for Real-Time 3D Shape-Based Recognition

Xuzhan Chen, Youping Chen, Kashish Gupta, Jie Zhou, Homayoun Najjaran
DOI: https://doi.org/10.1016/j.neucom.2018.07.061
IF: 6
2018-01-01
Neurocomputing
Abstract: The field of 3D object recognition has been dominated by 2D view-based methods, mostly because of the lower accuracy and larger computational load of 3D shape-based methods. Recognition from 3D shape offers appreciable advantages, e.g., use of depth information and independence from ambient lighting, but a satisfactory solution for 3D shape-based object recognition is still lacking. In this paper, a statistical method that models the input and output as random variables is first used to investigate the reasons for the inferior performance of the 3D convolution operation. The analysis suggests that an excessively large kernel causes the output variance of the 3D convolution operation to blow up dramatically and makes the output features less discriminative. Then, based on the results of this analysis and inspired by the underlying principle of 3D shapes, SliceNet is proposed to learn 3D shape features using anisotropic 3D convolution. Specifically, the proposed method learns features from the original 2D planar sketches that make up the 3D shape and has a significantly lower output variance. Experiments on ModelNet show that the recognition accuracy of the proposed SliceNet is comparable to that of well-established 2D view-based methods. In addition, SliceNet has a significantly smaller model size, a simpler architecture, and shorter training and inference times than 2D view-based and other 3D object recognition methods. An experiment with real-world data shows that the model trained on CAD files generalizes to real-world objects without any re-training or fine-tuning.
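The variance argument in the abstract can be illustrated with a minimal sketch. It is not the paper's own analysis. It assumes zero-mean, mutually independent inputs and weights, under which the variance of one convolution output equals the number of kernel taps times the weight variance times the input variance. The kernel shapes (3, 3, 3) and (1, 3, 3), and the function names `fan_in`, `predicted_output_variance`, and `simulated_output_variance`, are hypothetical choices for illustration and are not taken from SliceNet.

```python
import random
import statistics


def fan_in(kernel_shape, in_channels=1):
    """Number of input values summed to produce one output value."""
    depth, height, width = kernel_shape
    return depth * height * width * in_channels


def predicted_output_variance(kernel_shape, var_w=1.0, var_x=1.0, in_channels=1):
    """Var(y) = n * Var(w) * Var(x), assuming zero-mean independent w and x."""
    return fan_in(kernel_shape, in_channels) * var_w * var_x


def simulated_output_variance(kernel_shape, trials=5000, seed=0):
    """Monte Carlo estimate of Var(y) with standard normal weights and inputs."""
    rng = random.Random(seed)
    n = fan_in(kernel_shape)
    outputs = [
        sum(rng.gauss(0.0, 1.0) * rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pvariance(outputs)


if __name__ == "__main__":
    # Isotropic 3D kernel versus an anisotropic kernel that spans a single slice.
    for shape in [(3, 3, 3), (1, 3, 3)]:
        print(shape, predicted_output_variance(shape), simulated_output_variance(shape))
```

Under these assumptions, shrinking the kernel along one axis cuts the number of summed terms, and with it the output variance, by the same factor. This is consistent with the abstract's claim that the anisotropic convolution has a lower output variance, though the paper's statistical model may differ in its details.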