Speaker Segmentation Based on Sparse Neural Network

Yong MA, Chang-chun BAO
DOI: https://doi.org/10.11936/bjutxb2014050063
2015-01-01
Abstract: A speaker segmentation method based on a sparse neural network is presented. A sparse neural network with one hidden layer extracts speaker factor features from the super-vector features of the speech signal. K-means clustering then assigns a label to every speech frame, and these labels are used to segment the different speakers. Over-fitting is addressed by applying the dropout technique while training the sparse network. An evaluation on a multi-speaker audio stream corpus generated from the TIMIT database shows that segmentation performance improves as the number of hidden nodes in the sparse network increases, and that the proposed algorithm performs better than both the Bayesian information criterion (BIC) method and the sparse auto-encoder method.
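The pipeline described in the abstract (hidden-layer features, K-means frame labels, segment boundaries at label changes) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the activation function, the dropout variant, the network weights, or the number of clusters, so the sigmoid activation, inverted dropout, untrained placeholder weights, K = 2, and all function names here are hypothetical.

```python
import numpy as np

def hidden_features(x, w, b, drop_rate=0.0, rng=None):
    """One hidden layer with a sigmoid activation (assumed).

    When drop_rate > 0, inverted dropout is applied, as would be done
    during training; use drop_rate=0 when extracting features.
    """
    h = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    if drop_rate > 0.0:
        if rng is None:
            rng = np.random.default_rng(0)
        mask = rng.random(h.shape) >= drop_rate
        h = h * mask / (1.0 - drop_rate)
    return h

def kmeans_labels(feats, k=2, iters=20):
    """Plain Lloyd's K-means with a deterministic farthest-point init."""
    centers = [feats[0]]
    for _ in range(1, k):
        # Squared distance of every frame to its nearest chosen center.
        d = np.min([np.sum((feats - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(feats[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels

def change_points(labels):
    """Frame indices where the cluster label differs from the previous frame."""
    return (np.flatnonzero(np.diff(labels)) + 1).tolist()
```

In use, `hidden_features` would be applied frame by frame to the super-vector features with trained weights, `kmeans_labels` would label the resulting frames, and `change_points` would return candidate speaker boundaries. A real system would likely smooth the label sequence before reading off boundaries, since isolated mislabelled frames otherwise produce spurious segments.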