Contribution to the Moroccan Darija sentiment analysis in social networks

Sara El Ouahabi, Safâa El Ouahabi, El Wardani Dadi
DOI: https://doi.org/10.1007/s13278-023-01129-1
2023-10-21
Social Network Analysis and Mining
Abstract: With the rise of social media, there has been growing interest in developing automatic sentiment analysis and opinion mining tools in natural language processing (NLP). However, most current research focuses on Indo-European languages, particularly English, and the large community of people who use dialects is not adequately served by existing tools. To our knowledge, there is currently no publicly available sentiment analysis dataset for the Moroccan dialect (MAD) that covers all social networks. In this work, we address this gap by focusing on sentiment analysis for the Moroccan Arabic dialect (Darija): we create a large, high-quality dataset of Moroccan dialectal text extracted from different social media sources (Facebook, Twitter, YouTube, Instagram and websites) that covers a wide range of domains, including sports, arts, politics, education and society. The dataset is characterized by its size, quality, and variety. We also experiment with different machine learning algorithms and feature extraction models, and test a transformer-based model (BERT).