Underdetermined Blind Source Separation of Speech Mixtures Unifying Dictionary Learning and Sparse Representation.
Xie, Yuan; Xie, Kan; Xie, Shengli
DOI: https://doi.org/10.1007/s13042-021-01406-5
2021-01-01
International Journal of Machine Learning and Cybernetics
Abstract: Underdetermined blind source separation of speech mixtures is a challenging issue in the classical “cocktail-party” problem. Recently, dictionary learning has attracted attention as a way to solve this problem. In this paper, we build a novel framework that treats the underdetermined blind separation of speech mixtures as a sparse signal recovery problem by using a compressed sensing model. First, to eliminate the influence of additive white Gaussian noise, a wavelet transform with a tunable Q-factor is used as a noise-reduction preprocessing step. Second, to obtain an accurate estimate of the mixing matrix, a blind identification method is designed that identifies single-source data. Third, to find the best dictionary to represent the training signals, an arbitrary subset of codewords and the corresponding coefficients are updated simultaneously. In the source signal recovery stage, block processing is applied to the mixed signals so that the source components are separated from each block by using sparse representation. The whole source signals are then reconstructed by concatenating the separated source components from all the blocks. The advantage of this block-wise approach is reduced computational complexity. Finally, experimental results on separating underdetermined speech mixtures demonstrate the superiority of the proposed algorithm.
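
The block-wise recovery stage mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the mixing matrix `A` is already known, each source block is sparse in a dictionary `D`, the block system is written with a Kronecker product, and plain orthogonal matching pursuit (OMP) stands in for whatever sparse solver the paper uses. The names `omp` and `separate_blocks` are hypothetical, and the identity dictionary in the usage lines is only a placeholder for a learned one.

```python
import numpy as np


def omp(Phi, y, n_nonzero, tol=1e-10):
    """Orthogonal matching pursuit: approximate y using at most n_nonzero columns of Phi."""
    norms = np.linalg.norm(Phi, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for empty atoms
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) <= tol:
            break
        corr = np.abs(Phi.T @ residual) / norms
        if support:
            corr[support] = 0.0  # never select the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ sol
    coef = np.zeros(Phi.shape[1])
    if support:
        coef[support] = sol
    return coef


def separate_blocks(X, A, D, block_len, n_nonzero):
    """Recover N sources from M < N mixtures, one block at a time.

    X: (M, T) mixtures, A: (M, N) mixing matrix (assumed known),
    D: (block_len, K) dictionary (assumed shared by all sources).
    Samples after the last full block are left at zero.
    """
    M, T = X.shape
    N, K = A.shape[1], D.shape[1]
    # Stacking the M mixture blocks gives x = (A kron D) c, where c stacks
    # the N sparse coefficient vectors of the source blocks.
    Phi = np.kron(A, D)
    S = np.zeros((N, T))
    for start in range(0, T - block_len + 1, block_len):
        block = X[:, start:start + block_len]
        c = omp(Phi, block.reshape(-1), n_nonzero)
        # Map coefficients back to the time domain and concatenate.
        S[:, start:start + block_len] = (D @ c.reshape(N, K).T).T
    return S


# Toy usage: 2 mixtures of 3 sources, with at most one source active per sample.
A = np.array([[1.0, 0.6, 0.0],
              [0.0, 0.8, 1.0]])  # unit-norm columns
D = np.eye(4)                    # placeholder dictionary
S_true = np.zeros((3, 8))
S_true[0, 0], S_true[1, 2] = 1.0, -2.0
S_true[2, 5], S_true[0, 7] = 0.5, 1.5
X = A @ S_true
S_est = separate_blocks(X, A, D, block_len=4, n_nonzero=2)
```

Processing short blocks keeps the Kronecker system small (here 8 by 12 per block) instead of solving a single large problem over the whole signal, which is one plausible reading of the complexity advantage the abstract claims.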