Statistical Voice Activity Detection Based on Sparse Representation over Learned Dictionary

Shi-Wen Deng, Ji-Qing Han
DOI: https://doi.org/10.1016/j.dsp.2013.03.005
IF: 2.92
2013-01-01
Digital Signal Processing
Abstract: In this paper, we present a novel approach to voice activity detection (VAD) based on the sparse representation of input noisy speech over a learned dictionary. First, we investigate the relationship between signal detection and sparse representation within a Bayesian framework. Second, we derive the decision rule and an adaptive threshold from a likelihood ratio test, modeling the non-zero elements of the sparse representation as Gaussian distributed. The experimental results show that the proposed approach outperforms current statistical model-based methods, such as those using Gaussian, Laplacian, and Gamma models, under white, babble, and vehicle noise conditions.
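The pipeline outlined in the abstract (sparse coding of a frame, then a likelihood ratio test on the non-zero coefficients) can be illustrated with a minimal sketch. This is not the paper's derivation: the abstract does not give the decision rule, so the sketch assumes a generic zero-mean Gaussian test (noise only versus speech plus noise), a fixed rather than adaptive threshold, plain matching pursuit as the sparse coder, and a hand-written dictionary in place of a learned one. All function names and parameters (`matching_pursuit`, `gaussian_log_lr`, `is_speech`, `noise_var`, `speech_var`) are hypothetical.

```python
import math

def matching_pursuit(x, atoms, n_nonzero):
    """Greedy sparse coding of x over unit-norm atoms (assumed coder, not the paper's)."""
    residual = list(x)
    coef = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Correlate every atom with the current residual.
        corr = [sum(a_i * r_i for a_i, r_i in zip(a, residual)) for a in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(corr[j]))
        if corr[k] == 0.0:
            break  # nothing left to explain
        coef[k] += corr[k]
        residual = [r_i - corr[k] * a_i for r_i, a_i in zip(residual, atoms[k])]
    return coef

def gaussian_log_lr(coef, noise_var, speech_var):
    """Log-likelihood ratio over the non-zero coefficients.

    Assumed hypotheses: H0 (noise only) c ~ N(0, noise_var),
    H1 (speech present) c ~ N(0, noise_var + speech_var).
    """
    total_var = noise_var + speech_var
    llr = 0.0
    for c in coef:
        if c != 0.0:
            llr += 0.5 * math.log(noise_var / total_var)
            llr += 0.5 * c * c * (1.0 / noise_var - 1.0 / total_var)
    return llr

def is_speech(frame, atoms, noise_var, speech_var, threshold=0.0, n_nonzero=2):
    """Declare speech when the log-likelihood ratio exceeds a fixed threshold."""
    coef = matching_pursuit(frame, atoms, n_nonzero)
    return gaussian_log_lr(coef, noise_var, speech_var) > threshold

# Toy dictionary: the standard basis of R^4 stands in for a learned dictionary.
atoms = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
loud_frame = [3.0, 0.0, 0.1, 0.0]
quiet_frame = [0.1, 0.0, 0.05, 0.0]
print(is_speech(loud_frame, atoms, noise_var=1.0, speech_var=3.0))
print(is_speech(quiet_frame, atoms, noise_var=1.0, speech_var=3.0))
```

In this toy setup a frame with a large coefficient yields a positive log-likelihood ratio and a low-energy frame yields a negative one. An adaptive threshold, as the abstract describes, would replace the fixed `threshold` argument with a value updated from the estimated noise statistics.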