Speaker gender recognition based on combining the contribution of MFCC and pitch features

庞程,李晓飞,刘宏
DOI: https://doi.org/10.13245/j.hust.2013.s1.033
2013-01-01
Abstract: A speaker gender identification method for complex scenarios is presented, based on the combined contributions of MFCC (Mel-frequency cepstral coefficients) and pitch. The method combines MFCC template matching with pitch-based discrimination. The system's speech database (voicebox) contains 5,000 isolated-word utterances and 1,260 emotional utterances. In a quiet environment the speaker gender recognition rate reaches 98.88%; under babble noise at an SNR of 10 dB, the rate after spectral subtraction is 90.2%. The experiments also indicate that emotion strongly affects speaker gender recognition, especially for male voices.
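As a rough illustration of the pitch-based discrimination step mentioned in the abstract, the following minimal sketch estimates the fundamental frequency of a frame from its autocorrelation peak and compares it with a threshold. It does not reproduce the paper's method: the function names, the 60-400 Hz search range, and the 165 Hz decision threshold are illustrative assumptions, and the MFCC template matching and the fusion of the two cues are not shown.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak.

    fmin and fmax bound the lag search; both values are assumptions.
    Returns 0.0 when no positive autocorrelation peak is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def classify_gender_by_pitch(pitch_hz, threshold_hz=165.0):
    """Threshold the pitch; 165 Hz is an assumed value, not from the paper."""
    if pitch_hz <= 0.0:
        return "unvoiced"
    return "male" if pitch_hz < threshold_hz else "female"


def synth_tone(f0, sample_rate=8000, n=400):
    """Generate a pure sine tone as a stand-in for a voiced speech frame."""
    return [math.sin(2.0 * math.pi * f0 * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    for f0 in (120.0, 220.0):
        pitch = estimate_pitch_hz(synth_tone(f0), 8000)
        print(f0, round(pitch, 1), classify_gender_by_pitch(pitch))
```

Real speech would need voicing detection and smoothing of the pitch across frames before thresholding. A single fixed threshold is also likely to be fragile for emotional speech, which is consistent with the abstract's observation that emotion strongly affects recognition.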