Deep NMF for speech separation

J. Hershey,F. Weninger,Jonathan Le Roux
DOI: https://doi.org/10.1109/ICASSP.2015.7177933
2015-04-01
Abstract:Non-negative matrix factorization (NMF) has been widely used for challenging single-channel audio source separation tasks. However, inference in NMF-based models relies on iterative inference methods, typically formulated as multiplicative updates. We propose “deep NMF”, a novel non-negative deep network architecture which results from unfolding the NMF iterations and untying its parameters. This architecture can be discriminatively trained for optimal separation performance. To optimize its non-negative parameters, we show how a new form of back-propagation, based on multiplicative updates, can be used to preserve non-negativity, without the need for constrained optimization. We show on a challenging speech separation task that deep NMF improves in terms of accuracy upon NMF and is competitive with conventional sigmoid deep neural networks, while requiring a tenth of the number of parameters.
Computer Science
What problem does this paper attempt to address?