Multiscale Mesh Deformation Component Analysis with Attention-based Autoencoders

Jie Yang, Lin Gao, Qingyang Tan, Yihua Huang, Shihong Xia, Yu-Kun Lai
DOI: https://doi.org/10.48550/arXiv.2012.02459
2020-12-04
Abstract: Deformation component analysis is a fundamental problem in geometry processing and shape understanding. Existing approaches mainly extract deformation components in local regions at a similar scale, while deformations of real-world objects are usually distributed in a multi-scale manner. In this paper, we propose a novel method to extract multiscale deformation components automatically with a stacked attention-based autoencoder. The attention mechanism is designed to learn to softly weight multi-scale deformation components in active deformation regions, and the stacked attention-based autoencoder is learned to represent the deformation components at different scales. Quantitative and qualitative evaluations show that our method outperforms state-of-the-art methods. Furthermore, with the multiscale deformation components extracted by our method, the user can edit shapes in a coarse-to-fine fashion, which facilitates effective modeling of new shapes.
Graphics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is: **How can multi-scale deformation components be extracted automatically from a set of deformed shapes to support coarse-to-fine shape editing?** Existing methods mainly extract deformation components in local regions at a similar scale, whereas deformations of real-world objects are usually distributed across multiple scales. For example, a person's facial expression is a small-scale local deformation, while the bending of the whole body is a large-scale deformation.

To solve this problem, the authors propose a method based on a stacked attention-based autoencoder. It extracts multi-scale deformation components automatically, and its attention mechanism learns to weight those components in active deformation regions. The main contributions are:

1. **Automatic extraction of multi-scale deformation components for the first time**: Users can edit 3D mesh shapes with the extracted components in a coarse-to-fine manner, which makes 3D modeling more efficient.
2. **Novel deep architecture**: A stacked autoencoder architecture with an attention mechanism is proposed to decompose deformation components at different scales from a set of shapes.

### Method Overview

- **Input Representation**: The recently proposed ACAP (as-consistent-as-possible) deformation representation is used. It can handle large-scale deformations and is defined only on vertices, which makes mesh-based convolution easier to implement.
- **Network Structure**: A stacked autoencoder structure is adopted. The first-level autoencoder AE0 extracts large-scale deformation components, and the K second-level autoencoders AEk (1 ≤ k ≤ K) each focus on a different sub-region to extract small-scale deformation components.
- **Attention Mechanism**: Attention masks are generated from the parameters C of the fully-connected layer, enabling each second-level autoencoder to focus on a specific local region and thus extract finer deformation components.
- **Redundant Component Removal**: To keep the result concise, redundant components that contain little or no deformation information are removed.

### Mathematical Formulas

1. **Reconstruction Loss**:
   \[ L_{\text{recon}}=\frac{1}{N}\sum_{i=1}^{N}\|X_i-\hat{X}_i\|_2^2 \]
2. **Sparse Loss**:
   \[ \Omega(C)=\frac{1}{K_z}\sum_{k=1}^{K_z}\sum_{i=1}^{V}\Lambda_{ik}\|C_{k,i}\|_2 \]
   where \(\Lambda_{ik}\) is the sparsity regularization weight, defined as:
   \[ \Lambda_{ik}=\begin{cases} 0&\text{if }d_{ik}<d\\ 1&\text{if }d_{ik}\geq d \end{cases} \]
   Here \(d_{ik}\) is the normalized geodesic distance from vertex \(i\) to the center point of component \(k\), and \(d\) is a distance threshold.
3. **Non-trivial Regularization Term**:
   \[ V(Z)=\frac{1}{K_z}\sum_{j=1}^{K_z}\max\left(\max_m|Z_{jm}|-\theta,\,0\right) \]

Together, these components address the problem of multi-scale deformation component extraction and provide new tools for shape editing.
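To make the three formulas concrete, the following is a minimal NumPy sketch that evaluates each term exactly as written above. It is not the authors' implementation. The function names, the array layouts (`X` as `(N, V, F)`, `C` as `(K_z, V, F)`, `Z` as `(K_z, M)`, distances as `(V, K_z)`), and the reading of \(\|\cdot\|_2^2\) as the squared Euclidean norm over all entries of a shape's feature array are assumptions made for illustration.

```python
import numpy as np


def reconstruction_loss(X, X_hat):
    """L_recon: mean over N shapes of the squared L2 reconstruction error.

    X, X_hat: arrays of shape (N, V, F) (assumed layout).
    """
    diff = (X - X_hat).reshape(X.shape[0], -1)  # flatten each shape
    return np.mean(np.sum(diff ** 2, axis=1))


def sparsity_weights(d_norm, d):
    """Lambda_ik: 0 where d_ik < d, 1 where d_ik >= d.

    d_norm: (V, K_z) normalized geodesic distances (assumed layout).
    d: scalar distance threshold.
    """
    return (d_norm >= d).astype(float)


def sparse_loss(C, Lam):
    """Omega(C): weighted sum of per-vertex L2 norms, divided by K_z.

    C: (K_z, V, F) per-component, per-vertex parameters (assumed layout).
    Lam: (V, K_z) weights from sparsity_weights.
    """
    norms = np.linalg.norm(C, axis=2)  # (K_z, V), one norm per C_{k,i}
    return np.sum(Lam.T * norms) / C.shape[0]


def nontrivial_term(Z, theta):
    """V(Z): mean over components of max(max_m |Z_jm| - theta, 0).

    Z: (K_z, M) latent values, row j for component j (assumed layout).
    """
    peak = np.max(np.abs(Z), axis=1)  # max_m |Z_jm| for each component j
    return np.mean(np.maximum(peak - theta, 0.0))
```

With the binary weights, `sparse_loss` only penalizes the parameters of vertices whose normalized geodesic distance to a component's center is at least `d`, so each component is pushed toward a localized region around its center.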