WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds

Peizhuo Li, Sebastian Starke, Yuting Ye, Olga Sorkine-Hornung
DOI: https://doi.org/10.1145/3641519.3657508
2024-07-11
Abstract: We present a new approach for understanding the periodicity structure and semantics of motion datasets, independently of the morphology and skeletal structure of characters. Unlike existing methods using an overly sparse high-dimensional latent, we propose a phase manifold consisting of multiple closed curves, each corresponding to a latent amplitude. With our proposed vector quantized periodic autoencoder, we learn a shared phase manifold for multiple characters, such as a human and a dog, without any supervision. This is achieved by exploiting the discrete structure and a shallow network as bottlenecks, such that semantically similar motions are clustered into the same curve of the manifold, and the motions within the same component are aligned temporally by the phase variable. In combination with an improved motion matching framework, we demonstrate the manifold's capability of timing and semantics alignment in several applications, including motion retrieval, transfer and stylization. Code and pre-trained models for this paper are available at https://peizhuoli.github.io/walkthedog.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
The paper proposes a method for understanding and processing motion data from different characters (such as humans and dogs), with a particular focus on the periodic structure and semantics of motion. The method is independent of the characters' specific morphology and skeletal structure. The authors address limitations of existing methods, such as reliance on overly sparse high-dimensional latent representations or the need for explicit supervision signals, by introducing a concept called the "disconnected 1D manifold." The contributions of the paper are as follows:

1. **Novel Phase Manifold Design**: A new type of phase manifold is designed that aligns both timing and semantics. This manifold is shown to be compact, separable, and highly structured.
2. **Unsupervised Alignment**: The paper demonstrates how to align heterogeneous datasets using a narrow bottleneck and the intrinsic structure of motion, without any explicit supervision, self-supervised loss, or skeletal structure correspondence.
3. **Applications**: Through an improved motion matching framework, motion retrieval, transfer, and stylization are realized on the phase manifold.

To achieve these goals, the authors propose the "Vector Quantized Periodic Autoencoder" (VQ-PAE). It embeds the motions of different characters into a shared phase manifold by learning the phase and amplitude of each motion sequence and mapping them onto a manifold composed of multiple closed curves. Each closed curve corresponds to one discrete amplitude vector, and motions with similar semantics are embedded onto the same closed curve. Within a curve, the phase variable aligns motions in time. Additionally, the paper introduces a frequency-scaled motion matching algorithm to improve the responsiveness and smoothness of motion matching, especially when dealing with motions of different frequencies.
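To make the idea of a manifold built from several closed curves more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small hand-written codebook of amplitude vectors, a phase measured in cycles, and a simple sine/cosine parameterization of each curve; the names `CODEBOOK`, `quantize`, `embed`, `phase_distance`, and `retrieve` are all hypothetical and chosen only for illustration. In the paper the amplitudes and phases are learned by the autoencoder rather than given by hand.

```python
import math

# Hypothetical codebook of discrete amplitude vectors (an assumption for
# illustration). Each entry defines one closed curve of the manifold.
CODEBOOK = [
    (1.0, 0.2),
    (0.3, 1.5),
    (2.0, 2.0),
]


def quantize(amplitude):
    """Return the index of the codebook entry nearest to `amplitude`."""
    return min(range(len(CODEBOOK)),
               key=lambda k: math.dist(amplitude, CODEBOOK[k]))


def embed(amplitude, phase):
    """Map (amplitude, phase) to (curve index, point on that closed curve).

    `phase` is in cycles and is wrapped to [0, 1), so the curve closes on
    itself. The sin/cos layout is an assumed parameterization.
    """
    k = quantize(amplitude)
    angle = 2.0 * math.pi * (phase % 1.0)
    a = CODEBOOK[k]
    point = tuple(x * math.sin(angle) for x in a) + \
        tuple(x * math.cos(angle) for x in a)
    return k, point


def phase_distance(p, q):
    """Circular distance between two phases in cycles, in [0, 0.5]."""
    d = abs(p - q) % 1.0
    return min(d, 1.0 - d)


def retrieve(query, database):
    """Find the database entry on the query's curve with the closest phase.

    `query` is (amplitude, phase); `database` is a list of
    (label, curve_index, phase) tuples. Returns a label, or None if no
    entry lies on the same curve.
    """
    amplitude, phase = query
    k = quantize(amplitude)
    candidates = [entry for entry in database if entry[1] == k]
    if not candidates:
        return None
    return min(candidates, key=lambda e: phase_distance(e[2], phase))[0]
```

In this sketch, a motion frame from one character can be used as a query against frames of another character: the curve index plays the role of semantic alignment, and the circular phase distance plays the role of timing alignment.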
In summary, this paper addresses the problem of cross-character motion alignment and provides a general and effective solution capable of handling motion data of characters with different morphologies and skeletal structures.