Abstract: Automated choreography advances by generating dance from music. Current methods create skeleton keypoint sequences, not full dance videos, and cannot make specific individuals dance, limiting their real-world use. These methods also need precise keypoint annotations, making data collection difficult and restricting the use of self-made video datasets. To overcome these challenges, we introduce a novel task: generating dance videos directly from images of individuals guided by music. This task enables the dance generation of specific individuals without requiring keypoint annotations, making it more versatile and applicable to various situations. Our solution, the Dance Any Beat Diffusion model (DabFusion), utilizes a reference image and a music piece to generate dance videos featuring various dance types and choreographies. The music is analyzed by our specially designed music encoder, which identifies essential features including dance style, movement, and rhythm. DabFusion excels in generating dance videos not only for individuals in the training dataset but also for any previously unseen person. This versatility stems from its approach of generating latent optical flow, which contains all necessary motion information to animate any person in the image. We evaluate DabFusion's performance using the AIST++ dataset, focusing on video quality, audio-video synchronization, and motion-music alignment. We propose a 2D Motion-Music Alignment Score (2D-MM Align), which builds on the Beat Alignment Score to more effectively evaluate motion-music alignment for this new task. Experiments show that our DabFusion establishes a solid baseline for this innovative task. Video results can be found on our project page: https://DabFusion.github.io.

Dance2MIDI: Dance-driven multi-instrument music generation

Dance2Music: Automatic Dance-driven Music Generation

Dance2Music-Diffusion: leveraging latent diffusion models for music generation from dance videos

EnchantDance: Unveiling the Potential of Music-Driven Dance Movement

Music2Dance: DanceNet for Music-Driven Dance Generation

Example-Based Automatic Music-Driven Conventional Dance Motion Synthesis

Dance Any Beat: Blending Beats with Visuals in Dance Video Generation

Dance with Melody: An LSTM-autoencoder Approach to Music-oriented Dance Synthesis

DanceIt: Music-Inspired Dancing Video Synthesis

TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration

Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data

Dance-to-Music Generation with Encoder-based Textual Inversion

DanceMeld: Unraveling Dance Phrases with Hierarchical Latent Codes for Music-to-Dance Synthesis

Dancing to Music

Bidirectional Autoregressive Diffusion Model for Dance Generation

Quantized GAN for Complex Music Generation from Dance Videos

BeatDance: A Beat-Based Model-Agnostic Contrastive Learning Framework for Music-Dance Retrieval

Automatic Translation of Music-to-Dance for In-Game Characters