Design of accent classifier based on speech rhythm features

Droua-Hamdani, Ghania
DOI: https://doi.org/10.1007/s11042-023-14724-3
IF: 2.577
2023-02-22
Multimedia Tools and Applications
Abstract: Recognition systems suffer significant performance degradation when operating in foreign-accent conditions. This study proposes speech rhythm, considered the most discriminating prosodic parameter, as a way to help recognition systems overcome this mismatch. It presents a new approach to evaluating a multilayer perceptron (MLP) classifier based on the speech rhythm of native and non-native speakers, informed by statistical analysis. The effect of speaker gender was also investigated. Nine measures of speech rate were computed using the most established rhythm models. A set of statistical analyses was conducted to obtain an overall picture of the speakers' rhythm variability. For the recognition tasks, the classifier was trained and tested on sets of combined rhythm-metric vectors. The statistical results helped explain some unexpected recognition outcomes and guided the choice of the most suitable rhythm metrics. Across all experiments, the most suitable configuration for distinguishing native from non-native speakers was the one combining the VarcoX and CCI rhythm measures. With this combination, recognition accuracy reached about 81%, and system performance improved by about 9% compared with the other configurations. Performance rose further, to 87%, when only male speakers' rhythm values were used.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering