Corpus Construction for Aviation Speech Recognition

Yiyi Cui, Zhen Wang, Yanyu Lu, Shan Fu
DOI: https://doi.org/10.1007/978-3-031-05409-9_18
2022-01-01
Abstract: In aviation, safety is always the top priority. Human error is an important factor affecting flight safety and can have serious consequences. Because voice conversation is the primary means of communication in the cockpit, speech recognition technology can be applied to detect possible human error, but it requires carefully annotated speech transcripts for model training. This paper proposes a small-scale Chinese civil aviation professional corpus, constructed from civil aviation flight manuals and speech data recorded in real flight scenarios, with manual audio filtering and text annotation followed by thorough preprocessing and cleansing. In addition, we identify keywords and use them to extract the corpus, so that its topics are focused on the aviation domain and the model can better learn the distinctive features of aviation speech, such as aviation terms and quantity words. We also compare the word error rate (WER) of the speech recognition results before and after using our corpus. The experimental results show that the proposed corpus improves aviation speech recognition performance.
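For readers unfamiliar with the metric, WER is conventionally the token-level edit distance between a reference transcript and the recognizer output, divided by the number of reference tokens. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names `edit_distance` and `wer` and the sample phrases are illustrative assumptions, and the abstract does not say how the Chinese transcripts were tokenized.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference tokens into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    if not ref_tokens:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


# Hypothetical example: one digit of a flight level is misrecognized.
reference = "climb to flight level three five zero".split()
hypothesis = "climb to flight level three nine zero".split()
error_rate = wer(reference, hypothesis)  # 1 substitution over 7 tokens
```

Comparing this value for the same test utterances before and after training on a domain corpus gives the kind of contrast the abstract describes. For Chinese text the same function can be applied to character lists, which yields a character error rate instead.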