Robust Speaker Modeling Based on Constrained Nonnegative Tensor Factorization

Qiang Wu, Liqing Zhang, Guangchuan Shi
DOI: https://doi.org/10.1007/978-3-540-87732-5_2
2008-01-01
Abstract: Nonnegative tensor factorization is an extension of nonnegative matrix factorization (NMF) to the multilinear case, where nonnegativity constraints are imposed on the PARAFAC/Tucker model. In this paper, to identify speakers in noisy environments, we propose a new method based on the PARAFAC model, called constrained Nonnegative Tensor Factorization (cNTF). The speech signal is encoded as a general higher-order tensor in order to learn basis functions from multiple interrelated feature subspaces. We simulate a cochlear-like peripheral auditory stage, motivated by the human auditory perception mechanism. cNTF extracts a sparse speech feature representation, which is used for robust speaker modeling. Orthogonality and nonsmooth sparse control constraints are further imposed on the PARAFAC model in order to preserve the useful information of each feature subspace in the higher-order tensor. An alternating projection algorithm is applied to obtain a stable solution. Experimental results demonstrate that our method improves recognition accuracy, particularly in noisy environments.
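To make the underlying model concrete, the following is a minimal sketch of a plain nonnegative PARAFAC factorization of a third-order tensor. It is not the cNTF method of the paper: it uses standard multiplicative updates rather than the alternating projection algorithm, and it omits the orthogonality and nonsmooth sparse control constraints. The function names (khatri_rao, unfold, ntf_parafac, reconstruct) and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R), shape (J*K, R)."""
    J, R = B.shape
    K, _ = C.shape
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def unfold(X, mode):
    """Mode-n unfolding; the remaining axes keep their original order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def reconstruct(factors):
    """Rebuild the tensor as a sum of R nonnegative rank-one terms."""
    A, B, C = factors
    return np.einsum('ir,jr,kr->ijk', A, B, C)

def ntf_parafac(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Nonnegative PARAFAC of a 3-way tensor via multiplicative updates.

    Assumption: plain Frobenius-norm objective with no extra constraints.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive initialization keeps every update well defined.
    factors = [rng.random((dim, rank)) + 0.1 for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(X, mode) @ kr
            # Gram matrix of the Khatri-Rao product, computed cheaply.
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            denom = factors[mode] @ gram + eps
            # Elementwise ratio of nonnegative terms preserves nonnegativity.
            factors[mode] *= numer / denom
    return factors

if __name__ == "__main__":
    # Synthetic nonnegative rank-2 tensor standing in for speech features.
    rng = np.random.default_rng(1)
    truth = [rng.random((d, 2)) for d in (6, 5, 4)]
    X = reconstruct(truth)
    est = ntf_parafac(X, rank=2)
    rel_err = np.linalg.norm(X - reconstruct(est)) / np.linalg.norm(X)
    print("relative reconstruction error:", rel_err)
```

In a speaker-modeling setting the three modes would hold the interrelated feature subspaces mentioned in the abstract; the synthetic tensor here is only a placeholder for such data.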