Abstract: Product and service reviews guide potential customers, giving them practical knowledge about those products and services when making decisions. Sentiment classification is the task of automatically analyzing opinions expressed in textual reviews. The efficiency of this task depends on the set of representative features extracted from the reviews. However, the value of the extracted features lies mainly in those that contribute strongly to the classification process. This is where dimensionality reduction comes in: it eliminates noise and shrinks the high-dimensional feature space while preserving the required accuracy. The Arabic language and its datasets pose inherent challenges. Moreover, most sentiment classification studies that integrate dimensionality reduction have focused on English texts, with only a few studies conducted for other languages, including Arabic. Massive amounts of Arabic data have been generated owing to the large population of the Arab world, yet these technical gaps persist for the language. This paper proposes a supervised learning approach for sentiment classification of Arabic reviews. The approach uses optimized compact features that combine a representative feature set with feature reduction techniques, achieving high accuracy together with time and space savings. The feature set comprises a triple combination of N-gram features and positive/negative N-gram count features obtained after negation handling. The approach examines two linear transformation methods: principal component analysis (PCA) as an unsupervised transformation method and linear discriminant analysis (LDA) as a supervised transformation method. A spam detection process is executed prior to learning to increase classifier robustness.
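The feature set described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: the "triple combination" is read as unigrams, bigrams, and trigrams; negation handling is approximated by flipping the polarity of a lexicon word that directly follows a negation particle; and the tiny word lists and the names `NEGATORS`, `POSITIVE`, `NEGATIVE`, and `extract_features` are hypothetical placeholders, not the authors' resources.

```python
from collections import Counter

# Hypothetical toy resources; a real system would use a full Arabic lexicon.
NEGATORS = {"لا", "لم", "لن", "ليس", "ما"}  # common negation particles
POSITIVE = {"جيد", "ممتاز"}  # "good", "excellent"
NEGATIVE = {"سيئ", "رديء"}  # "bad", "poor"


def ngrams(tokens, n):
    """Return all contiguous n-grams of `tokens` as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def polarity_counts(tokens):
    """Count positive and negative lexicon words, flipping the polarity
    of a word that directly follows a negation particle."""
    pos = neg = 0
    for i, tok in enumerate(tokens):
        if tok in POSITIVE:
            polarity = 1
        elif tok in NEGATIVE:
            polarity = -1
        else:
            continue  # not a lexicon word
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity  # simplistic negation handling
        if polarity > 0:
            pos += 1
        else:
            neg += 1
    return pos, neg


def extract_features(review):
    """Build a sparse feature dict: 1- to 3-gram counts plus two polarity counts."""
    tokens = review.split()  # whitespace tokenization is a simplification
    features = Counter()
    for n in (1, 2, 3):
        features.update(ngrams(tokens, n))
    pos, neg = polarity_counts(tokens)
    features["__pos_count__"] = pos
    features["__neg_count__"] = neg
    return features


# "the hotel is not good": the positive word is negated, so it counts as negative.
features = extract_features("الفندق ليس جيد")
print(features["__pos_count__"], features["__neg_count__"])  # 0 1
```

In practice the resulting sparse vectors would be very high-dimensional, which is what motivates the feature reduction step.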
The proposed approach was evaluated on five Arabic opinion text datasets from different domains and of varying sizes (1.6 K to 94 K reviews). Experiments were conducted for two-class (positive/negative) and three-class (positive/negative/neutral) classification problems. Accuracy ranged from 95.5% to 99.8% for the two-class problem and from 92% to 97.3% for the three-class problem. LDA feature reduction outperformed PCA by an average of 4.34% in accuracy and 3.52% in F1 score. The overall approach outperformed existing related work in the literature by 23% in accuracy and 34% in F1 score. The experimental results show the efficiency of the proposed solution, which employs optimized features produced by integrating a feature reduction module with a representative feature set based on a negation-handled triple combination of N-gram features and positive/negative N-gram count features. Overall, the results demonstrate a substantial improvement: a 24% increase in accuracy, a 93% saving in feature space, and a 97% decrease in classification execution time.
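One way to see why a supervised reduction can beat PCA is a small two-class sketch. It assumes that "LDA" here denotes the supervised linear discriminant (Fisher) projection, and it uses hypothetical 2-D toy points rather than real review features. PCA picks the direction of largest variance without looking at labels, while the Fisher direction uses the labels to separate the class means relative to the within-class scatter.

```python
import math


def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around `center`."""
    sxx = sum((x - center[0]) ** 2 for x, _ in points)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in points)
    syy = sum((y - center[1]) ** 2 for _, y in points)
    return sxx, sxy, syy


def pca_direction(points):
    """Leading eigenvector of the total scatter matrix; labels are ignored."""
    a, b, c = scatter(points, mean(points))
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)  # largest eigenvalue
    v = (lam - c, b)
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm) if norm else (1.0, 0.0)


def lda_direction(class0, class1):
    """Fisher direction w = Sw^{-1} (m1 - m0) for two classes."""
    m0, m1 = mean(class0), mean(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # within-class scatter Sw
    det = a * c - b * b  # assumed non-zero, i.e. Sw is invertible
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    return ((c * dx - b * dy) / det, (a * dy - b * dx) / det)


def project(points, w):
    """Reduce each 2-D point to one dimension by projecting onto w."""
    return [x * w[0] + y * w[1] for x, y in points]


def separable(p0, p1):
    """True if the two sets of 1-D projections do not overlap."""
    return max(p0) < min(p1) or max(p1) < min(p0)


# Hypothetical toy data: large variance along x, class difference along y.
class0 = [(-3, 0.0), (-1, 0.2), (1, -0.2), (3, 0.0)]
class1 = [(-3, 1.0), (-1, 1.2), (1, 0.8), (3, 1.0)]

w_pca = pca_direction(class0 + class1)
w_lda = lda_direction(class0, class1)
print(separable(project(class0, w_pca), project(class1, w_pca)))  # False
print(separable(project(class0, w_lda), project(class1, w_lda)))  # True
```

On this toy set the single PCA component mixes the two classes, whereas the single Fisher component keeps them apart. This illustrates the mechanism only and says nothing about the size of the gap reported in the abstract.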

Improving Sentiment Analysis of Arabic Tweets by One-way ANOVA

Sentiment Analysis of Arabic Tweets: Feature Engineering and A Hybrid Approach

Sentiment Analysis of Arab Tweets: Unveiling Public Opinion Trends Using Machine Learning

Sentiment analysis of imbalanced Arabic data using sampling techniques and classification algorithms

Improving Sentiment Analysis in Arabic Using Word Representation

Effect of Word Embedding Variable Parameters on Arabic Sentiment Analysis Performance

Sentiment analysis of Arabic social media texts: A machine learning approach to deciphering customer perceptions

Optimizing Sentiment Classification for Arabic Opinion Texts

Sentiment Analysis For Modern Standard Arabic And Colloquial

A machine learning-based approach for sentiment analysis on distance learning from Arabic Tweets

Arabic Language Sentiment Analysis on Health Services

Heterogeneous Ensemble Deep Learning Model for Enhanced Arabic Sentiment Analysis

Sentiment Analysis on Arabic Public Opinions toward COVID-19 Vaccines Using Twitter Data

Aspect-based Sentiment Analysis and Location Detection for Arabic Language Tweets

Natural Language Processing for Arabic Sentiment Analysis: A Systematic Literature Review

Twitter sentiment analysis: An Arabic text mining approach based on COVID-19

Sentiment Analysis for Arabic in Social Media Network: A Systematic Mapping Study

Generative artificial intelligence in topic-sentiment classification for Arabic text: a comparative study with possible future directions

Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis

Sentiment analysis for Arabic language: A brief survey of approaches and techniques

A Comparative Study of Feature Selection Methods for Dialectal Arabic Sentiment Classification Using Support Vector Machine