Applying Multitask Learning To Acoustic-Phonemic Model For Mispronunciation Detection And Diagnosis In L2 English Speech

Shaoguang Mao, Zhiyong Wu, Runnan Li, Xu Li, Helen Meng, Lianhong Cai
DOI: https://doi.org/10.1109/ICASSP.2018.8461841
2018-01-01
Abstract: For mispronunciation detection and diagnosis (MDD), current approaches generally treat phonemes in correct pronunciations and in mispronunciations as the same, even though they may carry different characteristics. In addition, the serious imbalance between correct and mispronounced samples in the dataset further degrades performance. To address these problems, this paper investigates the use of multi-task (MT) learning to enhance the acoustic-phonemic model (APM) for MDD. Phonemes in correct pronunciations and in mispronunciations are processed separately but in a multi-task manner that considers both the correct-pronunciation and the mispronunciation recognition tasks. A feature representation module is further proposed to improve performance. Compared with the baseline APM, the proposed MT-APM and R-MT-APM achieve better performance not only in Precision, Recall, and F-Measure but also in mispronunciation detection and diagnosis accuracy. With the feature representation module, R-MT-APM achieves the highest mispronunciation detection accuracy.
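The abstract reports Precision, Recall, F-Measure, and detection accuracy without defining them. As a rough illustration only, the sketch below computes these quantities under the common convention that a mispronunciation is the positive class. The function name, the four count names, and the example numbers are hypothetical and are not taken from the paper, whose exact metric definitions may differ.

```python
def detection_metrics(true_reject, false_reject, false_accept, true_accept):
    """Detection metrics with mispronunciation treated as the positive class.

    Assumed (hypothetical) count definitions:
      true_reject  : mispronounced phones flagged as mispronounced
      false_reject : correct phones flagged as mispronounced
      false_accept : mispronounced phones accepted as correct
      true_accept  : correct phones accepted as correct
    """
    flagged = true_reject + false_reject          # everything the model flagged
    actual = true_reject + false_accept           # everything truly mispronounced
    total = flagged + false_accept + true_accept  # all scored phones

    # Guard each ratio against an empty denominator.
    precision = true_reject / flagged if flagged else 0.0
    recall = true_reject / actual if actual else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (true_reject + true_accept) / total if total else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Made-up counts chosen to mimic an imbalanced dataset
# (100 mispronounced phones out of 1000).
example = detection_metrics(true_reject=60, false_reject=20,
                            false_accept=40, true_accept=880)
```

With these made-up counts, accuracy is high largely because correct phones dominate, while recall on the mispronounced class is noticeably lower. This is one reason imbalanced MDD results are usually reported with Precision, Recall, and F-Measure alongside accuracy.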