Learning a hierarchical dictionary for single-channel speech separation

Guangzhao Bao, Yangfei Xu, Xu Xu, Zhongfu Ye
DOI: https://doi.org/10.1109/SSP.2014.6884679
2014-01-01
Abstract: This paper presents a novel algorithm for learning a hierarchical dictionary in the short-time Fourier transform (STFT) domain, which can improve the performance of dictionary learning (DL) based single-channel speech separation (SCSS). The goal of SCSS is to separate the underlying clean speech signals from a single mixture; this is often achieved by learning a pair of discriminative sub-dictionaries and sparsely coding the mixed speech signal over the dictionary pair. This paper considers the case of two source speech signals. Unfortunately, existing DL approaches cannot substantially avoid source confusion: when the mixture signal is sparsely represented over the dictionary pair, parts of the target speech component are explained by atoms of the interferer's dictionary, and vice versa. To suppress more of this source confusion, we divide the training sets into two layers of components and learn hierarchical sub-dictionaries from the different layers. Experimental results verify the superior performance of the proposed algorithm compared with existing approaches.
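The baseline that the abstract builds on (sparsely coding a mixture over a pair of sub-dictionaries and reading each source off its own atoms) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's hierarchical algorithm: the sub-dictionaries `D1` and `D2` are assumed to be already learned, the vector `y` stands in for a single STFT magnitude frame, and a plain greedy orthogonal matching pursuit is swapped in as the sparse coder. The function names `omp` and `separate` are hypothetical.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed sparse coder, not the paper's).

    Approximates y using at most n_nonzero columns (atoms) of D.
    """
    residual = y.copy()
    support = []                 # indices of the atoms selected so far
    sol = np.zeros(0)            # least-squares coefficients on the support
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break                # no new atom helps; stop early
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def separate(D1, D2, y, n_nonzero=4):
    """Code the mixture y over [D1, D2] and split the reconstruction by sub-dictionary."""
    D = np.hstack([D1, D2])
    a = omp(D, y, n_nonzero)
    n1 = D1.shape[1]
    s1_hat = D1 @ a[:n1]         # part of y explained by source-1 atoms
    s2_hat = D2 @ a[n1:]         # part of y explained by source-2 atoms
    return s1_hat, s2_hat


# Toy usage with random unit-norm atoms (placeholders for learned dictionaries).
rng = np.random.default_rng(0)
D1 = rng.standard_normal((16, 8))
D1 /= np.linalg.norm(D1, axis=0)
D2 = rng.standard_normal((16, 8))
D2 /= np.linalg.norm(D2, axis=0)
y = rng.standard_normal(16)
s1_hat, s2_hat = separate(D1, D2, y)
```

Source confusion, in this picture, is the case where energy that truly belongs to source 1 ends up in `s2_hat` because atoms of `D2` happen to correlate with it; the abstract's two-layer training is aimed at reducing exactly that leakage.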