Blind Source Separation and Identification for Speech Signals

Jie Yin, Zhiliang Liu, Yaqiang Jin, Dandan Peng, Jinlong Kang
DOI: https://doi.org/10.1109/sdpc.2017.82
2017-01-01
Abstract: Background noise reduction has been studied for many years. However, suppression of unwanted human speech noise has received little attention because of the sparsity of the speech signal. Traditional blind source separation (BSS) methods such as independent component analysis (ICA) assume prior knowledge of the number of sources and require that the number of sources equal the number of sensors. These limitations prevent the practical use of traditional BSS for speech enhancement in mobile phone communication. In this paper, a method combining BSS with a speaker recognition system (SRS) is developed for target speech extraction in underdetermined cases. Each independent speech signal is estimated from the speech mixture using a binary mask in the time-frequency (T-F) domain, which yields separated clean speech signals. The Mel frequency cepstral coefficients (MFCC) of each separated signal are then compared with the trained MFCC to compute a distortion. The separated signal with the smallest distortion is regarded as the target speech. Through a series of validations, optimum parameters for BSS and SRS are obtained. Additionally, the proposed method shows robustness in suppressing human-generated background noise.
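The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the binary mask is estimated or which distortion measure is used, so everything below is an assumption made for illustration: each T-F bin is assigned to the nearest centroid of some per-bin feature (for example an inter-channel level ratio), and distortion is taken as the mean Euclidean distance from each MFCC frame to its nearest trained MFCC vector. The function names `label_bins`, `binary_masks`, and `pick_target` are hypothetical and are not taken from the paper.

```python
import numpy as np


def label_bins(feature, centroids):
    """Assign each T-F bin to the nearest centroid (assumed clustering rule).

    feature: array of shape (freq, time) holding one scalar feature per bin.
    centroids: one assumed feature value per source.
    """
    dist = np.abs(feature[..., None] - np.asarray(centroids, dtype=float))
    return np.argmin(dist, axis=-1)


def binary_masks(labels, n_sources):
    """Build one 0/1 mask per source; every bin belongs to exactly one source."""
    return [(labels == k).astype(float) for k in range(n_sources)]


def pick_target(candidate_mfccs, trained_mfcc):
    """Return the index of the candidate with the smallest distortion.

    candidate_mfccs: list of arrays of shape (frames, coeffs), one per source.
    trained_mfcc: array of shape (vectors, coeffs) for the target speaker.
    Distortion (an assumption): mean distance from each frame to its nearest
    trained vector.
    """
    distortions = []
    for mfcc in candidate_mfccs:
        d = np.linalg.norm(mfcc[:, None, :] - trained_mfcc[None, :, :], axis=-1)
        distortions.append(float(d.min(axis=1).mean()))
    return int(np.argmin(distortions)), distortions


# Toy data: three sources (more sources than a two-sensor setup could
# resolve with ICA), on a 2 x 3 grid of T-F bins.
feature = np.array([[0.25, 0.9, 3.2], [1.1, 0.15, 2.8]])
mixture = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

labels = label_bins(feature, centroids=[0.2, 1.0, 3.0])
masks = binary_masks(labels, n_sources=3)
separated = [m * mixture for m in masks]  # masked spectrogram per source

# Toy MFCC matrices standing in for features of the separated signals.
trained = np.array([[0.0, 0.0], [1.0, 1.0]])
candidates = [
    np.array([[5.0, 5.0], [6.0, 6.0]]),
    np.array([[0.1, 0.0], [1.0, 0.9]]),
    np.array([[-3.0, -3.0]]),
]
target_index, distortions = pick_target(candidates, trained)
```

Because each bin is assigned to exactly one source, the masked spectrograms sum back to the mixture. In a real pipeline the masks would be applied to a complex STFT and inverted to the time domain before MFCC extraction, which is omitted here.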