Advancements in Arabic Text-to-Speech Systems: A 22-Year Literature Review

Khansa Chemnad, Achraf Othman
DOI: https://doi.org/10.1109/access.2023.3260844
IF: 3.9
2023-04-01
IEEE Access
Abstract: Although there are several speech synthesis models available for different languages tailored to specific domain requirements and applications, there is currently no readily available information on the latest trends in Arabic language speech synthesis. This can make it challenging for beginners to research and develop text-to-speech (TTS) systems for Arabic. To address this issue, this article provides a comprehensive overview of several scholars' contributions to the field of Arabic TTS, along with an examination of the unique features of the Arabic language and the corresponding challenges in creating TTS systems. Reporting only on papers discussing Arabic TTS, this systematic review evaluated available literature published between 2000 and 2022. We conducted a systematic review of six databases using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to identify studies that addressed Arabic Text-to-Speech systems. Of the 3719 articles identified, only 36 (0.96%) met our search criteria. Bibliometric analyses of these studies were conducted and reported. The results highlight the main types of speech synthesis techniques used in TTS systems: concatenative, formant, deep neural network (DNN), hybrid models, and multiagent. The corpora used to develop these systems, as well as the diacritization techniques incorporated, evaluation techniques, and the results of the performance of the systems are reported. Subjective evaluation using the mean opinion score is the most commonly applied method to measure the accuracy of systems. This study also identifies gaps in the literature and makes recommendations for future research directions.
computer science, information systems; telecommunications; engineering, electrical & electronic