ReINView: Re-interpreting Views for Multi-view 3D Object Recognition

Ruchang Xu, Wei Ma, Qing Mi, Hongbin Zha
DOI: https://doi.org/10.1109/iros47612.2022.9981777
2022-01-01
Abstract: Multi-view-based 3D object recognition is important in robot-environment interaction. However, recent methods simply extract features from each view via convolutional neural networks (CNNs) and then fuse these features to make predictions. These methods ignore the inherent ambiguities in each view caused by 3D-to-2D projection. To address this problem, we propose a novel deep framework for multi-view-based 3D object recognition. Instead of fusing the multi-view features directly, we design a re-interpretation module (ReINView) to eliminate the ambiguities in each view. To achieve this, ReINView re-interprets view features patch by patch, using context from nearby views, since local patches are generally co-visible from nearby viewpoints. Because contour shapes are also essential for 3D object recognition, ReINView further performs view-level re-interpretation, in which all views serve as context sources, since the target contours to be re-interpreted are globally observable. The re-interpreted multi-view features better reflect both the global and the local 3D structure of the object. Experiments on both ModelNet40 and ModelNet10 show that the proposed model outperforms state-of-the-art methods in 3D object recognition.
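The two-stage idea in the abstract (patch-level context from nearby views, then view-level context from all views) can be illustrated with a minimal NumPy sketch. The abstract does not say how re-interpretation is computed, so everything below is an assumption made for illustration: residual scaled dot-product attention stands in for the re-interpretation operator, cameras are assumed to lie on a circular ring, and the names `attend`, `nearby_views`, `patch_level`, `view_level`, and `reinterpret` are hypothetical. This is not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, context):
    # ASSUMPTION: residual scaled dot-product attention as a stand-in
    # for re-interpretation. queries: (n, d), context: (m, d) -> (n, d).
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d), axis=-1)
    return queries + weights @ context

def nearby_views(v, num_views, k=1):
    # ASSUMPTION: views are ordered on a circular camera ring, so the
    # neighbours of view v are the k views on each side (wrapping around).
    return [(v + o) % num_views for o in range(-k, k + 1) if o != 0]

def patch_level(feats, k=1):
    # feats: (V, P, d). Each view's patches use patches from nearby views
    # as context, reflecting local co-visibility.
    V, P, d = feats.shape
    out = np.empty_like(feats)
    for v in range(V):
        ctx = feats[nearby_views(v, V, k)].reshape(-1, d)
        out[v] = attend(feats[v], ctx)
    return out

def view_level(feats):
    # feats: (V, P, d). Pool patches into one descriptor per view, then let
    # every view use all views as context (contours are globally observable).
    view_desc = feats.mean(axis=1)  # (V, d)
    return attend(view_desc, view_desc)

def reinterpret(feats, k=1):
    # Patch-level step, then view-level step, then max-pool over views
    # to obtain a single object descriptor of shape (d,).
    refined = patch_level(feats, k)
    views = view_level(refined)
    return views.max(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(12, 49, 64))  # 12 views, 7x7 patches, 64-dim
    print(reinterpret(feats).shape)
```

In this sketch the only difference between the two stages is the context set: a small neighbourhood of views for patches, and the full set of views for whole-view descriptors. The pooling choices (mean over patches, max over views) are likewise placeholders.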