Efficient FPGA implementation for sound source separation using direction-informed multichannel non-negative matrix factorization

Philipp Diel,Antonio J. Muñoz-Montoro,Julio J. Carabias-Orti,Jose Ranilla
DOI: https://doi.org/10.1007/s11227-024-05945-w
IF: 3.3
2024-03-08
The Journal of Supercomputing
Abstract:Sound source separation (SSS) is a fundamental problem in audio signal processing, aiming to recover individual audio sources from a given mixture. A promising approach is multichannel non-negative matrix factorization (MNMF), which employs a Gaussian probabilistic model encoding both magnitude correlations and phase differences between channels through spatial covariance matrices (SCM). In this work, we present a dedicated hardware architecture implemented on field programmable gate arrays (FPGAs) for efficient SSS using MNMF-based techniques. A novel decorrelation constraint is presented to facilitate the factorization of the SCM signal model, tailored to the challenges of multichannel source separation. The performance of this FPGA-based approach is comprehensively evaluated, taking advantage of the flexibility and computational capabilities of FPGAs to create an efficient real-time source separation framework. Our experimental results demonstrate consistent, high-quality results in terms of sound separation.
computer science, theory & methods,engineering, electrical & electronic, hardware & architecture
What problem does this paper attempt to address?