Noise Processing and Multitask Learning for Far-Field Dialect Classification.

Hai Wang, Chenguang Qin, Kan Zhang, Ling Gao, Jie Ren, Yan Wang, Yuhui Ma
DOI: https://doi.org/10.1109/cbd51900.2020.00034
2023-01-01
Abstract: Deep learning has achieved strong results in speech recognition. The spread of embedded devices such as smart speakers, together with growing demand for dialect interaction, poses major challenges for far-field speech recognition and dialect recognition. To address dialect recognition on embedded devices under far-field conditions, we propose a deep neural network model with multi-task learning. First, we apply the AQPA (audio qualitative pre-analysis) method to the raw data of ten local Chinese dialects to reduce the influence of steady-state and non-steady-state signals. We then define dialect recognition as the main task and dialect-area recognition as the auxiliary task, using multi-task learning to improve the accuracy of dialect classification. Experimental results show that our approach improves accuracy by an average of 20% compared with a single-task model without noise reduction.
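The main-task and auxiliary-task setup described in the abstract can be illustrated with a minimal sketch of a combined training loss. This is not the authors' implementation: the abstract does not give the network architecture, the number of dialect areas, or how the two losses are weighted, so NUM_AREAS, AUX_WEIGHT, and the function names below are hypothetical choices made only for illustration. Only the count of ten dialects comes from the abstract.

```python
import math

NUM_DIALECTS = 10   # from the abstract: ten local Chinese dialects
NUM_AREAS = 3       # assumption: the abstract does not state the number of dialect areas
AUX_WEIGHT = 0.3    # assumption: hypothetical weight on the auxiliary-task loss


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-likelihood of the target class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(dialect_logits, area_logits, dialect_label, area_label,
                   aux_weight=AUX_WEIGHT):
    """Main-task (dialect) loss plus a down-weighted auxiliary (area) loss."""
    main = cross_entropy(dialect_logits, dialect_label)
    aux = cross_entropy(area_logits, area_label)
    return main + aux_weight * aux


# Uniform logits over both heads: loss equals log(10) + 0.3 * log(3).
loss = multitask_loss([0.0] * NUM_DIALECTS, [0.0] * NUM_AREAS, 4, 1)
```

In a setup like this, both output heads would typically share the same lower layers, so the gradient from the dialect-area loss also shapes the features used by the dialect classifier. Setting aux_weight to 0 recovers a single-task loss of the kind the abstract uses as its baseline.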