Parallel Phone Recognizer Based MLLR Speaker Recognition

Wang Eryu, Guo Wu, Dai Lirong
DOI: https://doi.org/10.1109/chinsl.2008.ecp.91
2008-01-01
Abstract: Using maximum-likelihood linear regression (MLLR) adaptation transforms as features for a support vector machine (SVM) has been adopted in recent NIST Speaker Recognition Evaluations (SRE). The approach is attractive because it makes use of high-level information about the speakers and can complement the standard GMM-UBM system. Its performance, however, is affected by the phone recognizer, especially in multilingual contexts. In this paper, we use an MLLR-SVM system built on a multi-language phone recognizer, which addresses the language dependence of the phone recognizer. We call this system parallel phone recognizer MLLR (PPR-MLLR). It has a simpler framework than existing MLLR methods and achieves better performance. On the NIST SRE 06 1conv4w-1conv4w task, the system achieves an EER of 5.44%. Combined with the cepstral GMM-UBM system, it achieves an EER of 4.20%, an improvement of almost 20%.