Hybrid RGB-D object recognition using Convolutional Neural Network and Fisher Vector

Wei Li, Zhiguo Cao, Yang Xiao, Zhiwen Fang
DOI: https://doi.org/10.1109/CAC.2015.7382553
2015-01-01
Abstract: With the recent emergence of low-cost depth sensors (e.g., Microsoft Kinect), RGB-D images can be captured more easily for object recognition. Compared with the existing RGB-based paradigm, depth information contributes extra descriptive cues (e.g., surface geometry) for object characterization. In this paper, a novel hybrid RGB-D object categorization model is proposed. It draws simultaneously on two state-of-the-art image representation technologies: Convolutional Neural Network (CNN) and Fisher Vector (FV). Specifically, objects are characterized by a CNN in the RGB domain. The CNN is not applied to the depth domain, however, because there are too few samples for training. Instead, we propose to extract the depth representation via FV with densely sampled HONV descriptors. The CNN and FV descriptions are then fused to form a unified RGB-D object signature, and an SVM is employed for the final decision. Experiments on a large-scale RGB-D dataset demonstrate that our hybrid RGB-D object recognition model outperforms state-of-the-art approaches by large margins (at least 6.3%).
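The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the CNN and FV descriptions are combined, so per-modality L2 normalization followed by concatenation is an assumption here, as are the function names, the feature dimensions, and the random placeholder features standing in for real CNN and FV outputs.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so neither modality dominates by magnitude."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def fuse_rgbd(cnn_rgb, fv_depth):
    """Build one signature per object from RGB and depth descriptors.

    Assumption: fusion is plain concatenation after L2 normalization;
    the abstract does not specify the fusion rule.
    """
    if cnn_rgb.shape[0] != fv_depth.shape[0]:
        raise ValueError("RGB and depth features must describe the same objects")
    return np.hstack([l2_normalize(cnn_rgb), l2_normalize(fv_depth)])

# Placeholder features (hypothetical sizes): 4 objects,
# 8-D "CNN" RGB descriptors and 6-D "FV over HONV" depth descriptors.
rng = np.random.default_rng(0)
cnn_rgb = rng.normal(size=(4, 8))
fv_depth = rng.normal(size=(4, 6))

signature = fuse_rgbd(cnn_rgb, fv_depth)
print(signature.shape)
```

The resulting rows would then be passed to an SVM for classification, as the abstract states; any standard SVM implementation could fill that role.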