SVHAN: Sequential View Based Hierarchical Attention Network for 3D Shape Recognition.
Yue Zhao, Weizhi Nie, An-An Liu, Zan Gao, Yuting Su
DOI: https://doi.org/10.1145/3474085.3475371
2021-01-01
Abstract: As an important field of multimedia, 3D shape recognition has attracted much research attention in recent years, and many deep learning models have been proposed for effective 3D shape representation. View-based methods perform particularly well because they explore visual characteristics comprehensively with the help of established 2D CNN architectures. However, current approaches have the following disadvantages. First, the vast majority of methods do not consider the sequential information among the multiple views, which can provide descriptive characteristics for shape representation. Second, incomplete exploration of multi-view correlations directly reduces the discriminative power of shape descriptors. Finally, coarsely aggregating multi-view features loses descriptive information, which limits the effectiveness of the shape representation. To handle these issues, we propose a novel sequential view based hierarchical attention network (SVHAN) for 3D shape recognition. Specifically, we first divide the view sequence into several view blocks. Then, we introduce a novel hierarchical feature aggregation module (HFAM), which hierarchically exploits the view-level, block-level, and shape-level features; the intra- and inter-view-block correlations are also captured to improve the discrimination of the learned features. Subsequently, a novel selective fusion module (SFM) is designed for feature aggregation, considering the correlations between different levels and preserving effective information. Finally, discriminative and informative shape descriptors are generated for the recognition task. We validate the effectiveness of our proposed method on two public databases. The experimental results show that SVHAN outperforms current state-of-the-art approaches.
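
The pipeline outlined in the abstract (split the view sequence into blocks, aggregate from view level to block level to shape level, then fuse the levels) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify how HFAM or SFM compute their weights, so `attention_pool` uses a simple dot-product-to-mean scoring rule as a stand-in, and the final fusion step simply reuses that pooling over the three levels. The function names and `block_size` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(features):
    """Element-wise mean of a list of equal-length feature vectors."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def split_into_blocks(views, block_size):
    """Split an ordered view sequence into consecutive, non-overlapping blocks."""
    if block_size <= 0 or len(views) % block_size != 0:
        raise ValueError("number of views must be a positive multiple of block_size")
    return [views[i:i + block_size] for i in range(0, len(views), block_size)]


def attention_pool(features):
    """Weighted sum of feature vectors.

    Assumed scoring rule (not from the paper): each vector is scored by its
    scaled dot product with the mean vector, and scores pass through softmax.
    """
    dim = len(features[0])
    center = mean_vector(features)
    scores = [sum(a * b for a, b in zip(f, center)) / math.sqrt(dim) for f in features]
    weights = softmax(scores)
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def describe_shape(views, block_size):
    """Build one descriptor from per-view features via three levels."""
    blocks = split_into_blocks(views, block_size)
    # View level -> block level: pool the views inside each block.
    block_feats = [attention_pool(block) for block in blocks]
    # Block level -> shape level: pool across blocks.
    shape_feat = attention_pool(block_feats)
    # Stand-in for selective fusion: weight one summary vector per level.
    view_feat = attention_pool(views)
    block_summary = mean_vector(block_feats)
    return attention_pool([view_feat, block_summary, shape_feat])


# Usage: 12 views with 4-dimensional placeholder features, 3 blocks of 4 views.
views = [[math.sin(i + d) for d in range(4)] for i in range(12)]
descriptor = describe_shape(views, block_size=4)
```

In a trained network the per-view features would come from a 2D CNN backbone and the attention weights would be learned; the sketch only shows how the three levels nest.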