HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks

Filip Szatkowski, Karol J. Piczak, Przemysław Spurek, Jacek Tabor, Tomasz Trzciński
2024-01-26
Abstract: Implicit neural representations (INRs) are a rapidly growing research field that provides alternative ways to represent multimedia signals. Recent applications of INRs include image super-resolution, compression of high-dimensional signals, and 3D rendering. However, these solutions usually focus on visual data, and adapting them to the audio domain is not trivial. Moreover, they require a separately trained model for every data sample. To address this limitation, we propose HyperSound, a meta-learning method leveraging hypernetworks to produce INRs for audio signals unseen at training time. We show that our approach can reconstruct sound waves with quality comparable to other state-of-the-art models.
Sound, Artificial Intelligence, Machine Learning, Neural and Evolutionary Computing, Audio and Speech Processing