Abstract: In recent years, the prevalence of fake reviews on online platforms has become a significant concern, because these deceptive reviews can mislead consumers and distort purchasing decisions. This paper explores methods for detecting fake reviews using machine learning techniques. We used the Deceptive Opinion Spam dataset, which contains truthful and deceptive reviews of 20 Chicago hotels. The dataset comprises 1600 reviews, evenly split among four categories: truthful positive, deceptive positive, truthful negative, and deceptive negative. Our primary objective was to classify each review as either truthful or deceptive using several machine learning algorithms. We constructed a data frame with columns for the review text, polarity class, and spamity class. Polarity indicates whether a review is positive or negative, while spamity distinguishes truthful from deceptive reviews. Stopwords were removed from the reviews using the NLTK package, and text mining techniques were applied to convert the text strings into numerical data. We also extracted parts of speech from the reviews to use as features in our models. We experimented with four classifiers: Naïve Bayes, Support Vector Machine (SVM), Decision Tree, and Random Forest. The Naïve Bayes classifier, specifically the Multinomial NB algorithm, achieved an accuracy of 89.13%. The SVM yielded an accuracy of 82.155%, and the Decision Tree 65.55%. The Random Forest classifier achieved the highest accuracy, 91.72%. Confusion matrices, generated with the sklearn.metrics module, were used to verify the accuracy of each algorithm. Because the Random Forest and Naïve Bayes classifiers performed best, these two models were selected for further analysis.
Our findings indicate that these machine learning techniques can effectively identify fake reviews, helping to mitigate their misleading impact on consumers. In future work, we aim to extend the study to datasets from other platforms such as Amazon and Flipkart, and to explore different feature selection methods. We also plan to apply sentiment classification algorithms in tools such as Python, R, Statistical Analysis System (SAS), and Stata to detect fake reviews, and to compare the performance of these tools. This research was supported by the Technical University of Kerala. We thank our colleagues for their expertise, which significantly aided the research, although they may not agree with all the interpretations presented in this paper. Through this study, we contribute to ongoing efforts to improve the reliability of online reviews and protect consumers from deceptive practices.
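
The pipeline summarized in the abstract (stopword removal, conversion of text to word counts, a Multinomial Naïve Bayes classifier, and a confusion matrix) can be illustrated with a minimal pure-Python sketch. This is not the implementation used in the study, which relied on NLTK and sklearn. The four training reviews, the short stopword list, and the names `tokenize`, `MultinomialNB`, and `confusion_matrix` are all assumptions made up for illustration, and the toy data says nothing about the accuracies reported above.

```python
import math
from collections import Counter

# Hypothetical mini stopword list (assumption); the study used NLTK's list.
STOPWORDS = {"the", "a", "an", "was", "is", "and", "i", "to", "of", "it", "in", "my"}

def tokenize(text):
    """Lowercase, strip surrounding punctuation, keep alphabetic non-stopwords."""
    words = (w.strip(".,!?;:\"'()") for w in text.lower().split())
    return [w for w in words if w.isalpha() and w not in STOPWORDS]

class MultinomialNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        self.totals = {c: sum(self.word_counts[c].values()) for c in self.classes}
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(class_counts[c] / len(labels))
                          for c in self.classes}
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [w for w in tokenize(text) if w in self.vocab]

        def score(c):
            denom = self.totals[c] + len(self.vocab)
            return self.log_prior[c] + sum(
                math.log((self.word_counts[c][w] + 1) / denom) for w in tokens)

        return max(self.classes, key=score)

def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

# Made-up toy reviews (assumption); not taken from the actual dataset.
train_texts = [
    "The room was clean and the staff was helpful.",
    "Check-in took ten minutes and the elevator was slow.",
    "My husband and I had the most amazing luxurious experience ever!",
    "Absolutely amazing hotel, my family loved this luxurious experience!",
]
train_labels = ["truthful", "truthful", "deceptive", "deceptive"]

model = MultinomialNB().fit(train_texts, train_labels)
train_pred = [model.predict(t) for t in train_texts]
print(model.classes)
print(confusion_matrix(train_labels, train_pred, model.classes))
print(model.predict("An amazing and luxurious stay"))
```

In practice the same steps would more likely be expressed with NLTK's stopword corpus and sklearn's vectorizers, classifiers, and metrics, evaluated on a held-out split rather than on the training data as in this toy run.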
