Non-negative Tensor Factorization for Speech Enhancement

Liang He, Weiqiang Zhang, Mengnan Shi
DOI: https://doi.org/10.2991/icaita-16.2016.5
2016-01-01
Abstract: This paper proposes a speech enhancement algorithm based on non-negative tensor factorisation. We group adjacent time-frequency matrices of the spectrogram into a tensor, which serves as the basic input to the algorithm. Non-negative tensor factorisation is then applied to separate the speech and noise sources. The proposed strategy benefits from both short-time spectral analysis and long-term information. From the standpoint of auditory theory and linguistics, the latter preserves the temporal dynamics and intrinsic structure of speech, which are important for perceptual continuity and integrity. We collected several types of real-life noise and conducted experiments on the TIMIT database. Experimental results show that both the segmental signal-to-noise ratio (SSNR) and the perceptual evaluation of speech quality (PESQ) scores improved significantly.
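The two steps named in the abstract, stacking adjacent time-frequency matrices into a tensor and factorising that tensor under a non-negativity constraint, can be illustrated with a minimal sketch. The abstract does not specify the paper's factorisation model, update rules, or how speech and noise components are told apart, so everything below is an assumption made for illustration: a rank-R CP (PARAFAC) model fitted with standard multiplicative updates for the Frobenius error, non-overlapping segments, and the hypothetical names `stack_segments` and `nonneg_cp`.

```python
import numpy as np


def stack_segments(spec, seg_len):
    """Cut a (freq, time) magnitude spectrogram into adjacent, non-overlapping
    segments of seg_len frames and stack them into a (freq, seg_len, n_seg) tensor.
    Trailing frames that do not fill a whole segment are dropped (an assumption)."""
    n_freq, n_frames = spec.shape
    n_seg = n_frames // seg_len
    trimmed = spec[:, : n_seg * seg_len]
    # reshape gives (freq, n_seg, seg_len); move the segment index to the last axis
    return trimmed.reshape(n_freq, n_seg, seg_len).transpose(0, 2, 1)


def nonneg_cp(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Non-negative CP factorisation X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r],
    fitted with multiplicative updates that keep all factors non-negative.
    Returns the factors and the relative error before and after each iteration."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Strictly positive initialisation so multiplicative updates can move every entry
    A = rng.random((I, rank)) + 1e-3
    B = rng.random((J, rank)) + 1e-3
    C = rng.random((K, rank)) + 1e-3
    x_norm = np.linalg.norm(X)

    def rel_err():
        X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
        return np.linalg.norm(X - X_hat) / (x_norm + eps)

    errors = [rel_err()]
    for _ in range(n_iter):
        # Each update is (data projected onto the other two factors) divided by
        # (current model projected the same way), applied elementwise.
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
        errors.append(rel_err())
    return A, B, C, errors


if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram (129 bins, 400 frames)
    spec = np.abs(np.random.default_rng(0).normal(size=(129, 400)))
    X = stack_segments(spec, seg_len=20)
    A, B, C, errors = nonneg_cp(X, rank=8, n_iter=50)
    print(X.shape, errors[0], errors[-1])
```

In this sketch the columns of `A` act as spectral patterns, the columns of `B` as their temporal evolution within a segment, and the columns of `C` as per-segment activations. Assigning components to speech or noise and resynthesising the enhanced signal would require details that the abstract does not give.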