Abstract: We present a new approach for understanding the periodicity structure and semantics of motion datasets, independently of the morphology and skeletal structure of characters. Unlike existing methods using an overly sparse high-dimensional latent, we propose a phase manifold consisting of multiple closed curves, each corresponding to a latent amplitude. With our proposed vector quantized periodic autoencoder, we learn a shared phase manifold for multiple characters, such as a human and a dog, without any supervision. This is achieved by exploiting the discrete structure and a shallow network as bottlenecks, such that semantically similar motions are clustered into the same curve of the manifold, and the motions within the same component are aligned temporally by the phase variable. In combination with an improved motion matching framework, we demonstrate the manifold's capability of timing and semantics alignment in several applications, including motion retrieval, transfer and stylization. Code and pre-trained models for this paper are available at https://peizhuoli.github.io/walkthedog.

What problem does this paper attempt to address?

The main goal of this paper is to propose a new method for understanding and processing motion data of different characters (such as humans and dogs), with a particular focus on the periodic structure and semantics of the motion. The method is independent of the specific morphology and skeletal structure of the characters. The authors address the limitations of existing methods, such as reliance on sparse high-dimensional latent representations or the need for explicit supervision signals, by introducing a new concept called the "disconnected 1D manifold." Specifically, the contributions of the paper are as follows:

1. **Novel Phase Manifold Design**: A new type of phase manifold is designed that aligns both timing and semantics. This manifold is shown to be compact, separable, and highly structured.
2. **Unsupervised Alignment**: The paper demonstrates how to align heterogeneous datasets using a narrow bottleneck and the intrinsic structure of motion, without any explicit supervision, self-supervised loss, or skeletal structure correspondence.
3. **Applications**: Through an improved motion matching framework, motion retrieval, transfer, and stylization are realized on the phase manifold.

To achieve these goals, the authors propose the "Vector Quantized Periodic Autoencoder" (VQ-PAE). It embeds the motions of different characters into a shared phase manifold by learning the phase and amplitude of each motion sequence and mapping them onto a manifold composed of multiple closed curves. Each closed curve corresponds to a distinct discrete amplitude vector, and motions with similar semantics are embedded onto the same closed curve. Additionally, the paper introduces a frequency-scaled motion matching algorithm to improve the responsiveness and smoothness of motion matching, especially when dealing with motions of different frequencies.
In summary, this paper addresses the problem of cross-character motion alignment and provides a general and effective solution capable of handling motion data of characters with different morphologies and skeletal structures.
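The idea of a manifold made of several closed curves, one per discrete amplitude vector, can be illustrated with a small sketch. This is not the authors' implementation: the codebook values, the function names `quantize` and `manifold_point`, and the sine/cosine parameterization of each curve are assumptions chosen only to show how a quantized amplitude selects a curve and how a phase value in [0, 1) selects a position along it.

```python
import numpy as np

def quantize(amplitude, codebook):
    """Return the index and value of the codebook entry nearest to `amplitude`.

    Hypothetical helper: a plain nearest-neighbour lookup in Euclidean distance.
    """
    dists = np.linalg.norm(codebook - amplitude, axis=1)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

def manifold_point(amplitude, phase):
    """Map (amplitude vector, phase in [0, 1)) to a point on a closed curve.

    Assumed parameterization: each amplitude channel scales a sine and a
    cosine of the same phase angle, so the curve closes when phase wraps.
    """
    angle = 2.0 * np.pi * phase
    return np.concatenate([amplitude * np.sin(angle), amplitude * np.cos(angle)])

# Hypothetical codebook with two amplitude vectors, i.e. two closed curves.
codebook = np.array([[1.0, 0.5],
                     [0.2, 2.0]])

# A continuous amplitude (as an encoder might produce) snaps to one curve.
idx, amp = quantize(np.array([0.9, 0.6]), codebook)

# Phase 0 and phase 1 land on the same point: the curve is closed.
p_start = manifold_point(amp, 0.0)
p_wrap = manifold_point(amp, 1.0)
```

In this sketch, two motions that quantize to the same codebook index lie on the same curve, and comparing their phase values gives a notion of temporal alignment, which mirrors the role the summary assigns to the amplitude and phase variables.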

WalkTheDog: Cross-Morphology Motion Alignment via Phase Manifolds
