PAN: Phoneme-Aware Network for Monaural Speech Enhancement

Zhihao Du, Ming Lei, Jiqing Han, Shiliang Zhang
DOI: https://doi.org/10.1109/ICASSP40776.2020.9054334
2020-01-01
Abstract: Current methods for monaural speech enhancement rely on acoustic information and seldom consider the phonetic information of an utterance. In the voice conversion community, significant progress has been achieved by exploiting phonetic information through phonetic posteriorgrams (PPGs). Inspired by this progress, we propose a phoneme-aware network (PAN) that uses noisy PPGs for speech enhancement. Since PPG prediction and speech enhancement benefit from each other, a PPG predictor is incorporated into the PAN, and an iterative training algorithm is proposed for it. Experimental results show that using phonetic information improves enhancement performance in terms of speech intelligibility, perceptual quality, and character error rate. To the best of our knowledge, this is the first work to introduce PPGs into speech enhancement.