Learning scale-aware relationships via Laplacian decomposition-based transformer for 3D human pose estimation

Jeonghwan Kim, Hyukmin Kwon, Seong Yong Lim, Wonjun Kim
DOI: https://doi.org/10.1007/s00530-023-01216-5
IF: 3.9
2024-01-19
Multimedia Systems
Abstract: This paper presents a method for 3D human pose estimation via the Laplacian decomposition-based transformer. Transformer-based approaches effectively estimate the non-local interactions between 3D mesh vertices of the whole body, and graph models have begun to be embedded into the transformer to account for neighborhood interactions in the kinematic topology. Even though such a combination has shown remarkable progress in 3D human pose estimation, scale-aware relationships between body parts are not sufficiently explored in the literature. To address this gap, we propose to apply the Laplacian pyramid module to the transformer, which decomposes encoded features into Laplacian residuals of different scale spaces. Furthermore, we compute self-attention separately for each body part to generate more natural human poses. Experimental results on benchmark datasets show that the proposed method improves the performance of 3D human pose estimation. The code and model are publicly available at: https://github.com/DCVL-3D/Laphormer_release.
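To illustrate what decomposing encoded features into Laplacian residuals of different scales can look like, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the pairwise average pooling used for downsampling, the nearest-neighbour upsampling, and the (tokens, channels) feature layout are all assumptions made for illustration. Each residual holds the detail lost when moving to the next coarser scale, so the residuals plus the coarsest level reconstruct the input.

```python
import numpy as np

def downsample(x):
    # Assumed pooling: average adjacent token pairs, (N, C) -> (N // 2, C).
    n, c = x.shape
    return x.reshape(n // 2, 2, c).mean(axis=1)

def upsample(x):
    # Assumed upsampling: repeat each token twice, (N, C) -> (2 * N, C).
    return np.repeat(x, 2, axis=0)

def laplacian_decompose(features, levels):
    # Returns one residual per scale (fine to coarse) and the coarsest features.
    # The token count must be divisible by 2 ** levels.
    residuals = []
    current = features
    for _ in range(levels):
        coarse = downsample(current)
        residuals.append(current - upsample(coarse))
        current = coarse
    return residuals, current

def laplacian_reconstruct(residuals, coarsest):
    # Invert the decomposition by upsampling and adding residuals back.
    current = coarsest
    for residual in reversed(residuals):
        current = upsample(current) + residual
    return current

# Hypothetical encoded features: 16 tokens with 8 channels each.
rng = np.random.default_rng(0)
features = rng.standard_normal((16, 8))
residuals, coarsest = laplacian_decompose(features, levels=2)
restored = laplacian_reconstruct(residuals, coarsest)
```

In a scale-aware model, each residual could be processed by its own attention layers before recombination; how the paper does this is specified in the released code rather than in this sketch.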
computer science, information systems, theory & methods