Chinese Dialect Speech Recognition Based on End-to-end Machine Learning

Xinyue Quan, Fengrun Zhang, Xiang Xie
DOI: https://doi.org/10.1109/MLCR57210.2022.00012
2022-10-01
Abstract: With the development of end-to-end neural networks, end-to-end speech recognition has achieved performance comparable to that of traditional speech recognition methods. An end-to-end speech recognition model needs only the input speech features and the output text. This paper takes advantage of the end-to-end method and uses the dataset provided by the Oriental Language Recognition Challenge to build a Chinese dialect recognition system for Sichuanese, Hokkien, Shanghainese and Cantonese. Dialects are low-resource languages. To address the scarcity of dialect data, this paper proposes adding unrelated languages for joint training and adding a Chinese language model for joint decoding. The model achieves a 12% relative improvement in Character Error Rate over the baseline system.
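The reported metric can be made concrete with a small sketch. The following assumes the usual definition of Character Error Rate (character-level edit distance divided by reference length) and of relative improvement ((baseline - new) / baseline); the abstract does not state the exact scoring procedure, and the function names and numbers below are illustrative, not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline_cer: float, new_cer: float) -> float:
    """Relative CER reduction with respect to the baseline."""
    return (baseline_cer - new_cer) / baseline_cer


# Illustrative values only (assumed, not results from the paper).
example_cer = cer("你好世界", "你好世间")
example_gain = relative_improvement(0.25, 0.22)
```

Under these definitions, a 12% relative improvement means the new CER is 0.88 times the baseline CER, not 12 absolute percentage points lower.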
Linguistics, Computer Science