Learning Disentangled Audio Representations through Controlled Synthesis

Yusuf Brima, Ulf Krumnack, Simone Pika, Gunther Heidemann
2024-02-16
Abstract: This paper tackles the scarcity of benchmarking data in disentangled auditory representation learning. We introduce SynTone, a synthetic dataset with explicit ground truth explanatory factors for evaluating disentanglement techniques. Benchmarking state-of-the-art methods on SynTone highlights its utility for method evaluation. Our results underscore strengths and limitations in audio disentanglement, motivating future research.
Sound, Machine Learning, Audio and Speech Processing