Subband Energy Distance Measure Applied in Multi-Pass Speech/Non-Speech Discrimination

Wei Chu, Jia Liu
DOI: https://doi.org/10.1109/isspa.2007.4555466
2007-01-01
Abstract: This paper proposes a novel Subband Energy (SBE) distance measure that describes the differences between heterogeneous segments, and applies it to multi-pass speech/non-speech discrimination. The first pass is a segmentation stage based on the Bayesian Information Criterion (BIC). The second pass is a classification stage that employs a Gaussian Mixture Model (GMM) classifier. The third pass is a post-processing procedure that uses the SBE distance measure to locate precise boundaries between heterogeneous segments. A front-end speech/non-speech discriminator is built to extract speech segments from broadcast news data and pass them as input to the subsequent module. Experiments on the National Broadcast News corpus demonstrate the feasibility and effectiveness of the method: the overall frame misclassification rate is kept below 0.8%.
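To make the first pass concrete, the following is a minimal sketch of a BIC-based change-point test of the kind commonly used for audio segmentation. It is not taken from the paper: it assumes each window is modeled by a single full-covariance Gaussian over frame-level feature vectors, and the names `logdet_cov`, `delta_bic`, `best_change_point`, the penalty weight `lam`, and the `margin` parameter are all hypothetical choices made for illustration. The authors' actual features, windowing, and thresholds may differ.

```python
import numpy as np


def logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x, t, lam=1.0):
    """Delta-BIC for splitting the frames x (shape: n_frames x dim) at index t.

    A positive value favors two separate Gaussians (a change point at t);
    a negative value favors a single Gaussian (no change).
    """
    n, d = x.shape
    left, right = x[:t], x[t:]
    # Likelihood gain from modeling the two halves separately.
    gain = 0.5 * (
        n * logdet_cov(x)
        - len(left) * logdet_cov(left)
        - len(right) * logdet_cov(right)
    )
    # Penalty for the extra mean and covariance parameters (assumed weight lam).
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return gain - penalty


def best_change_point(x, margin=20, lam=1.0):
    """Return (index, score) of the candidate split with the highest delta-BIC.

    `margin` keeps enough frames on each side for a stable covariance estimate.
    """
    candidates = range(margin, x.shape[0] - margin)
    scores = [delta_bic(x, t, lam) for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]
```

In a segmenter built along these lines, a window would be declared to contain a boundary only when the best score is positive. The later passes described in the abstract (GMM classification and SBE-based boundary refinement) are not covered by this sketch.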