3D Shape Classification Based on Global and Local Features Extraction with Collaborative Learning

Bo Ding, Libao Zhang, Yongjun He, Jian Qin
DOI: https://doi.org/10.1007/s00371-023-03098-0
2024-01-01
Abstract: View-based 3D shape classification benefits from extracting both global and local features. We therefore propose a 3D shape classification method based on global and local feature extraction with collaborative learning. The method consists of a patch-level transformer sub-network (PTS) and a view-level transformer sub-network (VTS). In the PTS, each view is divided into multiple patches, and a multi-layer transformer encoder highlights discriminative patches and captures correlations among the patches of a view, filtering out uninformative content and enhancing informative content. The PTS then aggregates the patch features into a 3D shape representation with rich local details. In the VTS, a multi-layer transformer encoder assigns different attention weights to the views and models the contextual relationships among them, highlighting the discriminative views of a 3D shape and aggregating the view features into a 3D shape representation. A collaborative loss encourages the two branches to learn collaboratively and teach each other during training. Experiments on two 3D benchmark datasets show that the proposed method outperforms current methods.
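The abstract does not give the form of the collaborative loss. As a rough illustration only, the sketch below assumes a mutual-learning-style formulation in which each branch minimizes its own cross-entropy plus a KL-divergence term that pulls its prediction toward the other branch's prediction. The function names, the `weight` parameter, and the choice of KL divergence are assumptions for illustration, not the authors' definition.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of class logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class.
    return -math.log(max(probs[label], 1e-12))

def kl_divergence(p, q):
    # KL(p || q); q is clamped to avoid division by zero on underflow.
    return sum(pi * math.log(pi / max(qi, 1e-12))
               for pi, qi in zip(p, q) if pi > 0)

def collaborative_loss(patch_logits, view_logits, label, weight=1.0):
    # Hypothetical collaborative loss: each branch is supervised by the
    # label and additionally "taught" by the other branch's prediction.
    p_patch = softmax(patch_logits)  # PTS branch prediction
    p_view = softmax(view_logits)    # VTS branch prediction
    loss_patch = cross_entropy(p_patch, label) + weight * kl_divergence(p_view, p_patch)
    loss_view = cross_entropy(p_view, label) + weight * kl_divergence(p_patch, p_view)
    return loss_patch, loss_view
```

When both branches produce the same prediction, the KL terms vanish and each loss reduces to plain cross-entropy; when they disagree, each branch is penalized for diverging from the other, which is one plausible reading of "teach each other in training".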