Modeling method and modeling device for language identification

Liang He,Weiqiang Zhang,Jia Liu
2012-01-01
Abstract: The embodiment of the invention provides a modeling method for language identification, which comprises the following steps: inputting voice data, preprocessing the voice data to obtain a feature sequence, mapping the feature vectors to form a supervector, performing projection compensation on the supervector, and training a language model with a support vector machine algorithm; then applying the same steps to the speech under test to obtain its supervector, performing projection compensation on that supervector, scoring it with the language model, and identifying the language of the speech under test. The embodiment of the invention also provides a modeling device for language identification, which comprises a voice preprocessing module, a feature extraction module, a multi-coordinate system origin selection module, a feature vector mapping module, a subspace extraction module, a subspace projection compensation module, a training module and an identification module. The method and the device provided by the embodiment of the invention remove information in the high-dimensional statistics that does not contribute to identification, improve the accuracy of language identification, and reduce the computational complexity on an integrated circuit.
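The mapping and projection compensation steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the supervector is formed or how the subspace is extracted, so everything below is an assumption made for illustration: the function names (`to_supervector`, `nuisance_subspace`, `compensate`) are hypothetical, the mapping stacks mean offsets of frames from their nearest origin, and the subspace is estimated from within-language scatter in the style of nuisance attribute projection. Support vector machine training on the compensated supervectors is left out.

```python
import numpy as np


def to_supervector(frames, origins):
    """Map a (T, D) feature sequence to a single (K * D,) supervector.

    Assumed mapping: each frame is assigned to its nearest origin, and the
    supervector stacks, per origin, the mean offset of its assigned frames
    (zeros when no frame is assigned).
    """
    frames = np.asarray(frames, dtype=float)
    origins = np.asarray(origins, dtype=float)
    # Squared distance from every frame to every origin, shape (T, K).
    d2 = ((frames[:, None, :] - origins[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    parts = []
    for k, origin in enumerate(origins):
        assigned = frames[nearest == k]
        if len(assigned):
            parts.append(assigned.mean(axis=0) - origin)
        else:
            parts.append(np.zeros_like(origin))
    return np.concatenate(parts)


def nuisance_subspace(supervectors, labels, rank):
    """Estimate an orthonormal basis, shape (dim, rank), of the directions
    with the most within-language variability (assumed to be uninformative
    for identification)."""
    X = np.asarray(supervectors, dtype=float)
    labels = np.asarray(labels)
    centered = X.copy()
    for lab in np.unique(labels):
        mask = labels == lab
        centered[mask] -= X[mask].mean(axis=0)
    # Right singular vectors of the class-centered data are the
    # eigenvectors of the within-class scatter matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T


def compensate(x, U):
    """Projection compensation: remove the component of x lying in span(U)."""
    return x - U @ (U.T @ x)
```

Under these assumptions, the same `compensate` call would be applied to training supervectors before fitting the classifier and to the supervector of the speech under test before scoring, so that both sides are compared in the same compensated space.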