Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method

Shuai Yang, Zhiyong Wu, Binbin Shen, Helen Meng
DOI: https://doi.org/10.21437/interspeech.2018-1281
2018-01-01
Abstract: Most conventional methods for detecting glottal closure instants (GCIs) rely on signal processing techniques combined with different GCI candidate selection methods. This paper proposes a classification method that detects GCIs from speech waveforms using a convolutional neural network (CNN). The procedure is divided into two successive steps. First, a low-pass filtered signal is computed, and its negative peaks are taken as candidates for GCI placement. Second, a CNN-based classification model determines for each peak whether it corresponds to a GCI. The method is compared with three existing GCI detection algorithms on two publicly available databases. The proposed method achieves a detection accuracy of 98.23% in terms of F1-score. An additional experiment indicates that the model performs better when it is trained on speech data from the same speakers as those in the test set.
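The first step of the procedure (low-pass filtering, then taking negative peaks as GCI candidates) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the moving-average filter stands in for whatever low-pass filter the authors use, the function names `moving_average`, `negative_peaks`, and `candidate_windows` are hypothetical, and the window length is arbitrary. The CNN classifier of the second step is not reproduced; the sketch only prepares fixed-length waveform windows that such a classifier could take as input.

```python
import math


def moving_average(signal, width):
    """Crude low-pass filter: centered moving average.

    Assumption: a stand-in for the paper's low-pass filter, which the
    abstract does not specify.
    """
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def negative_peaks(signal):
    """Return indices of local minima whose value is below zero."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] < 0
        and signal[i] < signal[i - 1]
        and signal[i] <= signal[i + 1]
    ]


def candidate_windows(waveform, candidates, half_len):
    """Cut a zero-padded window of 2 * half_len + 1 samples around each candidate.

    Assumption: each window would be the input to a binary classifier
    (GCI / not GCI); the window size is illustrative only.
    """
    windows = []
    for c in candidates:
        win = [
            waveform[j] if 0 <= j < len(waveform) else 0.0
            for j in range(c - half_len, c + half_len + 1)
        ]
        windows.append(win)
    return windows


# Synthetic stand-in for voiced speech: a 100 Hz sine sampled at 8 kHz.
fs = 8000
waveform = [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]

filtered = moving_average(waveform, width=5)
candidates = negative_peaks(filtered)
windows = candidate_windows(waveform, candidates, half_len=40)
print(candidates, len(windows))
```

On real speech the candidate list would contain both true GCIs and spurious peaks, which is why the abstract's second step classifies each candidate rather than accepting all of them.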