Gender Identification using MFCC for Telephone Applications - A Comparative Study

Jamil Ahmad, Mustansar Fiaz, Soon-il Kwon, Maleerat Sodanil, Bay Vo, Sung Wook Baik
DOI: https://doi.org/10.48550/arXiv.1601.01577
2016-01-07
Abstract: Gender recognition is an essential component of automatic speech recognition and interactive voice response systems. Determining the gender of the speaker reduces the computational burden of further processing in such systems. Typical methods for gender recognition from speech depend largely on feature extraction and classification. This study evaluates the performance of several state-of-the-art classification methods, including the tuning of their parameters, to help select the optimal classifier for gender recognition tasks. Five classification schemes, namely k-nearest neighbor, naïve Bayes, multilayer perceptron, random forest, and support vector machine (SVM), are comprehensively evaluated for determining gender from telephone speech using Mel-frequency cepstral coefficients (MFCC). Experiments were performed to determine the effects of training data size, speech stream length, and parameter tuning on classification performance. Results suggest that SVM is the best of the five classifiers for gender recognition.
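The kind of parameter-tuning experiment described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: it covers only one of the five classifiers (k-nearest neighbor), and it replaces real MFCC features with synthetic 13-dimensional Gaussian vectors (13 is a common MFCC count, assumed here rather than taken from the paper). The helper names `make_toy_data`, `knn_predict`, and `accuracy` are hypothetical.

```python
import random
from collections import Counter


def make_toy_data(n_per_class, dim=13, seed=0):
    """Build synthetic stand-ins for MFCC vectors (assumption: not real speech)."""
    rng = random.Random(seed)
    data = []
    # Each class is a Gaussian cloud around a different center.
    for label, center in (("female", 1.0), ("male", -1.0)):
        for _ in range(n_per_class):
            vector = [rng.gauss(center, 1.0) for _ in range(dim)]
            data.append((vector, label))
    rng.shuffle(data)
    return data


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training vectors."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))

    nearest = sorted(train, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


data = make_toy_data(100)
train, test = data[:150], data[150:]

# Sweep the neighborhood size k, the tunable parameter of k-NN.
results = {k: accuracy(train, test, k) for k in (1, 3, 5, 7)}
best_k = max(results, key=results.get)
print(results, best_k)
```

Odd values of k are used so that a two-class vote cannot tie. A real study would substitute MFCC vectors extracted from telephone speech and repeat the sweep for each classifier and training-set size.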
Sound