Transferring Informal Text in Arabic as Low Resource Languages: State-of-the-Art and Future Research Directions

Ebtesam H. Almansor, Ahmed Al-Ani, Farookh Khadeer Hussain
DOI: https://doi.org/10.1007/978-3-030-22354-0_17
2019-06-21
Abstract: Rapid growth in internet technology has led to increased use of social media platforms, which make communication between users easier. In these communications, users write in their everyday language, which is considered non-standard. Non-standard text contains a great deal of noise, such as abbreviations and slang, which are more common in English, and dialect words, which are widely used in Arabic. Such text is challenging for natural language processing tools. It therefore needs to be processed and transferred into a form similar to its standard form. Normalization and translation approaches have been used to transfer informal text. However, these approaches require large labeled or parallel datasets. While high-resource languages such as English have enough parallel datasets, low-resource languages such as Arabic lack them. Therefore, in this paper we focus on Arabic and its dialects as a low-resource language in the area of transferring non-standard text using normalization and translation approaches.