Unsupervised Discovery of an Extended Phoneme Set in L2 English Speech for Mispronunciation Detection and Diagnosis.

Shaoguang Mao, Xu Li, Kun Li, Zhiyong Wu, Xunying Liu, Helen Meng
DOI: https://doi.org/10.1109/icassp.2018.8462635
2018-01-01
Abstract: Second language (L2) speech is often labelled with native phoneme categories. Hence, we often observe segments for which it is difficult, if not impossible, to decide on a categorical phoneme label. We refer to these segments as "non-categorical" phoneme units. Existing approaches to mispronunciation detection and diagnosis (MDD) mostly focus on categorical phoneme errors, where one native phoneme is substituted for another; non-categorical errors are not considered. To better represent L2 speech for improved MDD, this work aims to discover an Extended Phoneme Set in L2 speech (L2-EPS) which includes not only the categorical phonemes based on the native set, but also non-categorical phoneme units. We apply an optimized k-means algorithm to cluster phoneme-based phonemic posterior-grams (PPGs), which are generated through an acoustic-phonemic model (APM). We then derive the L2-EPS from an analysis of the resulting clusters. We verified experimentally that the non-categorical phonemes in L2-EPS can extend the native phoneme categories to better describe L2 speech. Hence L2-EPS can enrich existing approaches to MDD for better performance.
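The clustering idea in the abstract can be illustrated with a minimal sketch. It assumes each segment is summarized as one posterior vector over a toy inventory of three native phonemes, and it uses plain Lloyd's k-means with a deterministic farthest-first initialization, not the paper's optimized k-means, whose details the abstract does not give. The synthetic data, the helper names (`toy_ppg`, `farthest_first_init`) and the `THRESHOLD` value of 0.7 are hypothetical and chosen only for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_init(points, k):
    """Deterministic init: start at points[0], then repeatedly add the point
    farthest from all centroids chosen so far (an assumption of this sketch)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def toy_ppg(base, rng, noise=0.02):
    """Hypothetical posterior vector: a base distribution plus small noise,
    renormalized so the entries sum to 1."""
    raw = [b + rng.uniform(-noise, noise) for b in base]
    total = sum(raw)
    return [x / total for x in raw]


rng = random.Random(0)
# Two groups peaked on a single native phoneme, and one group whose mass is
# split between two phonemes (standing in for a non-categorical unit).
bases = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.45, 0.45, 0.10]]
points = [toy_ppg(b, rng) for b in bases for _ in range(20)]

labels, centroids = kmeans(points, k=3)

# Hypothetical rule: a cluster counts as categorical if its centroid puts
# more than THRESHOLD of the posterior mass on a single native phoneme.
THRESHOLD = 0.7
kinds = ["categorical" if max(c) > THRESHOLD else "non-categorical" for c in centroids]
print(kinds)
```

In this toy setting, a cluster whose centroid has no dominant native phoneme is the kind of unit that an extended phoneme set would add alongside the native categories. The cluster analysis actually used to derive L2-EPS may differ from this simple threshold rule.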