PAGAN: A Phase-Adapted Generative Adversarial Networks for Speech Enhancement

Peishuo Li, Zihang Jiang, Shouyi Yin, Dandan Song, Peng Ouyang, Leibo Liu, Shaojun Wei
DOI: https://doi.org/10.1109/icassp40776.2020.9054256
2020-01-01
Abstract: Deep neural networks (DNNs) are becoming increasingly popular in speech enhancement. Most DNN-based speech enhancement approaches currently operate on magnitude spectra and ignore the phase mismatch between noisy and clean speech, which greatly limits speech enhancement performance. This paper presents a new approach to the phase mismatch problem: training a traditional DNN adversarially with a time-domain discriminator. Instead of estimating a more accurate phase, the DNN is trained to adapt to the noisy phase and to minimize the influence of the phase mismatch. We also propose a new evaluation metric to judge the degree of adaptation to the noisy phase. Experimental results show that adding a time-domain discriminator yields a more phase-adapted generator and significantly improves speech enhancement performance.
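The phase mismatch described in the abstract can be illustrated with a toy example. The sketch below is not the authors' model or their proposed metric; it assumes a single 64-sample frame, a hand-made sinusoidal "clean" signal, and a plain DFT in place of a real STFT front end. All function and variable names are hypothetical. It shows that even a perfect magnitude estimate leaves a time-domain error once it is recombined with the noisy phase, which is the kind of error a time-domain discriminator would be able to see.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real-valued frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT; keeps only the real part of the result."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def resynthesize(magnitude, phase):
    """Combine a magnitude spectrum with a phase spectrum and go back to time domain."""
    return idft([m * cmath.exp(1j * p) for m, p in zip(magnitude, phase)])


def mse(a, b):
    """Mean squared error between two equal-length waveforms."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy frame (assumption): a clean sinusoid plus noise that overlaps it in frequency.
n = 64
clean = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
noise = [0.5 * math.sin(2 * math.pi * 4 * t / n + 1.0)
         + 0.3 * math.cos(2 * math.pi * 9 * t / n) for t in range(n)]
noisy = [c + d for c, d in zip(clean, noise)]

clean_spec = dft(clean)
noisy_spec = dft(noisy)
clean_mag = [abs(c) for c in clean_spec]            # "perfect" magnitude estimate
clean_phase = [cmath.phase(c) for c in clean_spec]  # oracle phase (unavailable in practice)
noisy_phase = [cmath.phase(c) for c in noisy_spec]  # phase actually used at inference

# Same magnitude, two different phases.
oracle = resynthesize(clean_mag, clean_phase)
mismatched = resynthesize(clean_mag, noisy_phase)

err_oracle = mse(oracle, clean)        # essentially zero
err_mismatch = mse(mismatched, clean)  # clearly non-zero: phase mismatch alone

print(f"MSE with clean phase: {err_oracle:.3e}")
print(f"MSE with noisy phase: {err_mismatch:.3e}")
```

In this toy setting the magnitude-domain error is zero in both cases, so a loss defined only on magnitude spectra cannot distinguish the two reconstructions; only a comparison on the resynthesized waveform exposes the difference.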