Automated Prediction of Medieval Arabic Diacritics

Khalid Alnajjar, Mika Hämäläinen, Niko Partanen, Jack Rueter
DOI: https://doi.org/10.48550/arXiv.2010.05269
2020-10-11
Abstract: This study applies a character-level neural machine translation approach, built on a bidirectional recurrent neural network with long short-term memory (LSTM) units, to the diacritization of Medieval Arabic. The results improve on the online tool used as a baseline. The diacritization model has been published openly as an easy-to-use Python package available on PyPI and Zenodo. We found that context size should be considered when optimizing a feasible prediction model.
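To make the character-level translation framing concrete, the following is a minimal sketch of how training pairs for such a model might be prepared: the source side is the undiacritized text, the target side is the diacritized text, and both are split into characters. The abstract does not describe the authors' actual preprocessing, so the function names, the `_` word-boundary marker, and the way `context_size` chunks a sentence into groups of words are all assumptions made for illustration; this is not the API of the published package.

```python
# Arabic diacritic marks (tashkil): U+064B..U+0652, plus the dagger alif U+0670.
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text: str) -> str:
    """Remove diacritic marks, leaving the bare consonantal text."""
    return "".join(ch for ch in text if ch not in DIACRITICS)


def to_char_tokens(text: str) -> str:
    """Split text into space-separated characters.

    Word boundaries are written as "_" (an assumed convention) so that
    they survive the character split.
    """
    return " ".join("_" if ch == " " else ch for ch in text)


def make_pairs(sentence: str, context_size: int) -> list[tuple[str, str]]:
    """Build (source, target) pairs from a diacritized sentence.

    The sentence is cut into chunks of `context_size` words (hypothetical
    chunking scheme); each chunk yields one pair whose source is the
    undiacritized characters and whose target is the diacritized characters.
    """
    words = sentence.split()
    pairs = []
    for i in range(0, len(words), context_size):
        target = " ".join(words[i:i + context_size])
        source = strip_diacritics(target)
        pairs.append((to_char_tokens(source), to_char_tokens(target)))
    return pairs


if __name__ == "__main__":
    # Three diacritized words, written with escapes to keep the example ASCII-safe.
    sentence = (
        "\u0643\u064E\u062A\u064E\u0628\u064E "
        "\u0627\u0644\u0648\u064E\u0644\u064E\u062F\u064F "
        "\u062F\u064E\u0631\u0652\u0633\u064B\u0627"
    )
    for src, tgt in make_pairs(sentence, context_size=2):
        print(src, "->", tgt)
```

Varying `context_size` in a sketch like this changes how many words the model sees per example, which is the kind of trade-off the abstract refers to when it says context size should be considered.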
Computation and Language