Meshformers: Transformer-Based Networks for Mesh Understanding

Hao-Yang Peng, Meng-Hao Guo, Zheng-Ning Liu, Yong-Liang Yang, Tai-Jiang Mu
DOI: https://doi.org/10.2139/ssrn.4313526
2022-01-01
SSRN Electronic Journal
Abstract: Polygonal meshes have proven to be a powerful representation of 3D shapes, given their efficiency in expressing shape surfaces while preserving geometric and topological information. Increasing effort has gone into designing elaborate deep convolutional neural networks for meshes. However, because convolution operations are inherently local, these methods ignore the global connectivity among mesh primitives. In this paper, we introduce, for the first time, a transformer-like self-attention mechanism with downsampling architectures for mesh learning, capturing both the global and local relationships among mesh faces. To achieve this, we propose BFS-Pooling, which converts an irregular mesh into discrete tokens (i.e., sets of adjacent faces) using breadth-first search (BFS) and naturally builds hierarchical architectures for mesh learning by pooling mesh tokens. Benefiting from BFS-Pooling, several advanced transformer architectures for 2D images can be easily adapted to mesh data as MeshFormers, including MeshViT, MeshWin and MeshPVT. Experimental results demonstrate that MeshFormers achieve the best or competitive performance in both mesh classification and mesh segmentation tasks. Code will be available.
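The abstract does not specify how BFS-Pooling is implemented, so the following is only a minimal sketch of the general idea it describes: order faces by breadth-first search over face adjacency, cut that order into tokens of adjacent faces, and pool per-face features within each token. The function names (`face_adjacency`, `bfs_order`, `bfs_tokens`, `pool_tokens`), the fixed token size, and the use of mean pooling are all assumptions made for illustration, not details taken from the paper.

```python
from collections import deque


def face_adjacency(faces):
    """Map each face index to the set of faces sharing an edge with it."""
    edge_to_faces = {}
    for fi, face in enumerate(faces):
        n = len(face)
        for k in range(n):
            # An undirected edge is identified by its two vertex indices.
            edge = frozenset((face[k], face[(k + 1) % n]))
            edge_to_faces.setdefault(edge, []).append(fi)
    adj = {fi: set() for fi in range(len(faces))}
    for shared in edge_to_faces.values():
        for a in shared:
            for b in shared:
                if a != b:
                    adj[a].add(b)
    return adj


def bfs_order(adj, start=0):
    """Visit faces breadth-first, restarting on each unvisited component."""
    order, seen = [], set()
    roots = ([start] if start in adj else []) + sorted(adj)
    for root in roots:
        if root in seen:
            continue
        seen.add(root)
        queue = deque([root])
        while queue:
            f = queue.popleft()
            order.append(f)
            # Sorted neighbours keep the traversal deterministic.
            for nb in sorted(adj[f]):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
    return order


def bfs_tokens(faces, token_size, start=0):
    """Cut the BFS face order into consecutive chunks of token_size faces
    (assumed tokenisation; the last chunk may be shorter)."""
    order = bfs_order(face_adjacency(faces), start)
    return [order[i:i + token_size] for i in range(0, len(order), token_size)]


def pool_tokens(features, tokens):
    """Average per-face feature vectors within each token (assumed pooling)."""
    pooled = []
    for tok in tokens:
        dim = len(features[tok[0]])
        pooled.append(
            [sum(features[f][d] for f in tok) / len(tok) for d in range(dim)]
        )
    return pooled


# A strip of four triangles; consecutive faces share an edge.
strip = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
tokens = bfs_tokens(strip, token_size=2)
pooled = pool_tokens([[1.0], [3.0], [5.0], [7.0]], tokens)
```

Applying the same tokenise-and-pool step again to the pooled tokens would give a coarser level, which is one plausible reading of how a hierarchy could be built; the pooled token features would then be what a transformer block attends over.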