SteelyGAN: Semantic Unsupervised Symbolic Music Genre Transfer.

Zhaoxu Ding, Xiang Liu, Guoqiang Zhong, Dong Wang
DOI: https://doi.org/10.1007/978-3-031-18907-4_24
2022-01-01
Abstract: Recent progress in music genre transfer has been strongly influenced by unpaired image-to-image transfer models. However, state-of-the-art unpaired music genre transfer models sometimes fail to preserve the basic structure of the original song after genre transfer. In this paper, we propose SteelyGAN, a music genre transfer model that performs style transfer at both the pixel level (2D piano rolls) and the latent level (latent variables) by combining a latent space classification loss and a semantic consistency loss with cycle-connected generative adversarial networks. We also focus on music generation in individual bars of music with the novel Bar-Unit structure, in order to reduce the coupling of music data within a 4-bar segment. We propose a new MIDI dataset, the Free MIDI Library, which features less data duplication and more comprehensive metadata than other music genre transfer datasets. In experiments and evaluations performed separately on three pairs of music genres, namely Metal ↔ Country, Punk ↔ Classical and Rock ↔ Jazz, transferred and cycle-transferred music data generated by SteelyGAN achieve higher classification accuracy, as well as better objective and subjective evaluation results, than those generated by other state-of-the-art models.
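As a rough illustration of the data layout the abstract refers to (a 4-bar segment stored as a 2D piano roll and handled bar by bar), the following sketch splits a segment into individual bars and merges them back. It shows only the array reshaping, not the Bar-Unit network itself. The dimensions (16 time steps per bar, 84 pitches) and the helper names `split_into_bars` and `merge_bars` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

# Hypothetical dimensions chosen for illustration only.
BARS_PER_SEGMENT = 4   # the abstract mentions 4-bar segments
STEPS_PER_BAR = 16     # assumed time resolution per bar
NUM_PITCHES = 84       # assumed pitch range of the piano roll


def split_into_bars(segment: np.ndarray) -> np.ndarray:
    """Reshape a (time, pitch) piano roll into (bars, steps_per_bar, pitch)."""
    time_steps, num_pitches = segment.shape
    if time_steps != BARS_PER_SEGMENT * STEPS_PER_BAR:
        raise ValueError(
            f"expected {BARS_PER_SEGMENT * STEPS_PER_BAR} time steps, got {time_steps}"
        )
    return segment.reshape(BARS_PER_SEGMENT, STEPS_PER_BAR, num_pitches)


def merge_bars(bars: np.ndarray) -> np.ndarray:
    """Inverse of split_into_bars: concatenate bars along the time axis."""
    return bars.reshape(-1, bars.shape[-1])


# A mostly empty 4-bar segment with a single active note at time step 20.
segment = np.zeros((BARS_PER_SEGMENT * STEPS_PER_BAR, NUM_PITCHES), dtype=np.float32)
segment[20, 40] = 1.0

bars = split_into_bars(segment)
restored = merge_bars(bars)
```

Time step 20 falls in the second bar (index 1) at local step 4, so each bar can be passed to a model independently and the outputs concatenated afterwards.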