Keyword-specific normalization based keyword spotting for spontaneous speech

Li W, Liao Q.
DOI: https://doi.org/10.1109/ISCSLP.2012.6423490
2012-01-01
Abstract: This paper presents a novel architecture for keyword spotting in spontaneous speech, in which the keyword model is trained from a small number of acoustic examples provided by a user. The word-spotting architecture scores patch feature vector sequences extracted with sliding windows, then applies keyword-specific normalization and threshold setting. Dynamic time warping (DTW) based template matching and Gaussian Mixture Models (GMM) are proposed to model the keyword, and a separate GMM is proposed to model the non-keywords. Our keyword spotting experiments demonstrate the effectiveness of the proposed methods. More specifically, the proposed GMM log-likelihood ratio based method achieves an absolute improvement of about 17% in recall rate over the baseline system.
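The sliding-window DTW scoring and keyword-specific normalization mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so everything here is an assumption made for illustration: the function names (`dtw_distance`, `sliding_window_scores`, `normalize_scores`) are hypothetical, the features are synthetic 13-dimensional vectors, a single template stands in for the user-provided examples, and the normalization is a simple z-score against the keyword's own scores on background (non-keyword) data. The GMM log-likelihood ratio variant is not shown.

```python
import numpy as np


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences (frames x dims)."""
    n, m = len(a), len(b)
    # Local cost: Euclidean distance between every pair of frames.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)


def sliding_window_scores(utterance, template, hop=1):
    """DTW distance of the template to each window (patch) of the utterance."""
    win = len(template)  # assumption: window length equals template length
    starts = range(0, len(utterance) - win + 1, hop)
    return np.array([dtw_distance(utterance[s:s + win], template) for s in starts])


def normalize_scores(scores, background_scores):
    """Assumed keyword-specific normalization: z-score each distance against
    this keyword's distance distribution on non-keyword speech."""
    mu = background_scores.mean()
    sigma = background_scores.std() + 1e-8
    return (scores - mu) / sigma


# Synthetic demonstration: a noisy copy of the template is embedded at frame 50.
rng = np.random.default_rng(0)
template = rng.normal(size=(20, 13))
background = rng.normal(size=(200, 13))
utterance = np.vstack([
    rng.normal(size=(50, 13)),
    template + 0.05 * rng.normal(size=(20, 13)),
    rng.normal(size=(50, 13)),
])

raw = sliding_window_scores(utterance, template)
z = normalize_scores(raw, sliding_window_scores(background, template))
best = int(np.argmin(z))  # start frame of the best-matching window
```

Because the scores are normalized per keyword, a single threshold on `z` could in principle be shared across keywords whose raw DTW distances have different scales; how the paper actually sets its thresholds is not stated in the abstract.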