Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones

Chang Liu, Yang Ai, Zhenhua Ling
DOI: https://doi.org/10.1109/iscslp49672.2021.9362112
2021-01-01
Abstract: This paper proposes a phase spectrum recovery method for enhancing the low-quality speech captured by laser microphones, which is degraded by non-additive distortions during signal acquisition. Our preliminary study shows that common speech enhancement methods based on amplitude spectrum estimation cannot achieve satisfactory performance on this task. Therefore, this paper designs a speech enhancement model comprising an amplitude spectrum estimator (ASE) and a phase spectrum estimator (PSE). The ASE adopts autoregressive LSTMs and a multi-target learning framework to predict clean amplitude spectra from noisy ones. The PSE first adopts a waveform-based model to enhance noisy speech in the time domain, and then extracts phase spectra from the enhanced waveforms. Subsequently, the outputs of the two estimators are combined to reconstruct the final enhanced speech waveforms. Experimental results demonstrate that the proposed method achieves a higher PESQ score than the method using only the ASE and than waveform-based speech enhancement methods, including UNet and TCNN.
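The final combination step described in the abstract (amplitude from one estimator, phase from the other) can be illustrated with a minimal single-frame sketch. This is not the authors' implementation: the neural estimators are omitted, a naive DFT on one short frame stands in for a windowed STFT, and the names `combine_amplitude_and_phase`, `ase_amplitude`, and `pse_waveform_frame` are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns complex samples."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n
        for t in range(n)
    ]


def combine_amplitude_and_phase(ase_amplitude, pse_waveform_frame):
    """Pair a given amplitude spectrum with the phase of a time-domain frame.

    ase_amplitude: per-bin magnitudes (assumed to come from an amplitude estimator).
    pse_waveform_frame: time-domain frame (assumed to come from a waveform-based
    enhancer) from which only the phase spectrum is kept.
    """
    phase = [cmath.phase(c) for c in dft(pse_waveform_frame)]
    spectrum = [cmath.rect(a, p) for a, p in zip(ase_amplitude, phase)]
    # For amplitude and phase taken from real signals, the combined spectrum is
    # conjugate-symmetric, so the imaginary part is only rounding error.
    return [v.real for v in idft(spectrum)]


# Toy data standing in for the two estimator outputs (illustrative values only).
amplitude_source = [0.0, 0.5, 1.0, 0.3, -0.2, -0.9, -0.4, 0.1]
pse_waveform_frame = [0.1, -0.3, 0.7, 0.9, -0.5, 0.2, -0.8, 0.4]

ase_amplitude = [abs(c) for c in dft(amplitude_source)]
enhanced_frame = combine_amplitude_and_phase(ase_amplitude, pse_waveform_frame)
```

In this sketch the output frame keeps the magnitudes of `ase_amplitude` while its phase follows `pse_waveform_frame`. A practical system would apply the same idea per frame of a windowed STFT and resynthesize with overlap-add.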