Integration of Knowledge-Driven and Data-Driven Based Korean Phonetic Transcription
Dezhi Cao, Licheng Wu, Yue Zhao, Zhenna Lu
DOI: https://doi.org/10.1109/prai55851.2022.9904216
2022-01-01
Abstract: Automatic phonetic transcription plays a crucial role in building resources for Korean phonetic information processing. However, because Korean word formation is highly productive and new words emerge regularly, no database can contain every word. Annotating the pronunciation of unregistered words outside the database, referred to as out-of-vocabulary (OOV) words, is therefore a problem that must be solved in Korean natural language processing. Current academic approaches to grapheme-to-phoneme (G2P) conversion are commonly either knowledge-driven or data-driven. Purely knowledge-driven methods are difficult to adapt to the large volumes of data encountered in practice. Purely data-driven methods depend entirely on high-quality data, make it difficult to determine the input variables in a principled way, and require adequate, carefully selected model features. To address these issues, this paper proposes an automatic Korean G2P method that fuses the knowledge-driven and data-driven approaches. First, we extract features based on Korean pronunciation rules and the phonetic change rules that apply between words, and feed them into the model. Then, a data-driven model is trained on these features to perform automatic phonetic transcription of Korean, which lets it better fit the mapping between input and output variables. The proposed model reflects the phonological changes that occur in continuous Korean speech and accurately obtains the phonemes corresponding to the graphemes. Cross-validation was used to verify the validity and superiority of the method, and the average accuracy of grapheme-to-phoneme conversion reaches 94.63%.
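The abstract does not list the features or name the model, so the following is only a minimal sketch of what rule-based feature extraction could look like, not the authors' implementation. It decomposes precomposed Hangul syllables into initial, vowel, and final jamo using the standard Unicode arithmetic, and flags one well-known phonological context, liaison, where a final consonant is followed by a syllable with the silent initial ㅇ. The function names `decompose` and `extract_features` and the feature key `liaison_context` are hypothetical.

```python
# Illustrative sketch only; not the feature set or model used in the paper.
HANGUL_BASE = 0xAC00  # first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3  # last precomposed Hangul syllable
NUM_VOWELS = 21
NUM_FINALS = 28       # index 0 means "no final consonant"

INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")
NULL_INITIAL = INITIALS[11]  # the silent initial consonant


def decompose(syllable):
    """Split one precomposed Hangul syllable into (initial, vowel, final)."""
    code = ord(syllable)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    index = code - HANGUL_BASE
    initial = index // (NUM_VOWELS * NUM_FINALS)
    vowel = (index % (NUM_VOWELS * NUM_FINALS)) // NUM_FINALS
    final = index % NUM_FINALS
    return INITIALS[initial], VOWELS[vowel], FINALS[final]


def extract_features(word):
    """Build one feature dict per syllable, including cross-syllable context."""
    jamo = [decompose(ch) for ch in word]
    features = []
    for i, (initial, vowel, final) in enumerate(jamo):
        # Initial consonant of the following syllable, or "" at the word end.
        next_initial = jamo[i + 1][0] if i + 1 < len(jamo) else ""
        features.append({
            "initial": initial,
            "vowel": vowel,
            "final": final,
            "next_initial": next_initial,
            # Liaison: a final consonant moves onto a following silent initial.
            "liaison_context": final != "" and next_initial == NULL_INITIAL,
        })
    return features
```

For example, `extract_features("국어")` flags the first syllable as a liaison context, consistent with the pronunciation [구거]. In a fused setup, dictionaries like these would be encoded and passed to whatever sequence model is being trained; that encoding and model are not specified in the abstract.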