Optimal prosodic feature extraction and classification in parametric excitation source information for Indian language identification using neural network based Q-learning algorithm

Himanish Shekhar Das, Pinki Roy
DOI: https://doi.org/10.1007/s10772-018-09582-6
2018-12-03
International Journal of Speech Technology
Abstract: Automatic language identification (LID) systems are widely used in real-world multilingual speech applications. Speech formation relies on the vocal tract, and its excitation source information can be exploited for the LID task. In this paper, the LID system uses sub-segmental, segmental, and supra-segmental features from the linear prediction (LP) residual of the speech signal, which represent the excitation source information of speech in various native languages. The glottal flow derivative of the speech signal is obtained through the iterative adaptive inverse filtering method. Moreover, the prosodic features of the speech signal are extracted using the short-time Fourier transform because of its ability to process non-stationary signals. Finally, a deep neural network based Q-learning (DNNQL) algorithm is employed to identify the class label for a specific language. The proposed approach is validated experimentally on a recorded database of Indian languages. The proposed LID system performs well, with 97.3% accuracy, compared to other machine learning based approaches.
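The LP residual mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard autocorrelation method of linear prediction; the function names (`lp_coefficients`, `lp_residual`), the LP order of 10, and the synthetic test frame are illustrative assumptions, not settings taken from the paper, and the paper's sub-segmental, segmental, and supra-segmental framing is not reproduced here.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Estimate predictor coefficients a_1..a_p with the autocorrelation method.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal loading (an assumption) keeps the system well conditioned
    R = R + 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, order=10):
    """Return the LP residual e[n] = s[n] - sum_k a_k s[n-k] for one frame."""
    a = lp_coefficients(frame, order)
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k}
    inverse_filter = np.concatenate(([1.0], -a))
    # Keep only the first len(frame) samples of the filtered output
    return np.convolve(frame, inverse_filter)[: len(frame)]

# Usage on a synthetic frame (stand-in for a windowed speech segment)
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 0.05 * np.arange(400)) + 0.1 * rng.standard_normal(400)
residual = lp_residual(frame, order=10)
```

The residual approximates the excitation component that remains after the vocal tract contribution, modelled by the LP filter, has been removed from the frame.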