Detecting Spam Reviews on Vietnamese E-commerce Websites

Co Van Dinh, Son T. Luu, Anh Gia-Tuan Nguyen

DOI: https://doi.org/10.1007/978-3-031-21743-2_48

2022-12-09

Abstract: Customer reviews play an essential role in online shopping. People often refer to the reviews or comments of previous customers to decide whether to buy a new product. Exploiting this behavior, some people write untruthful and illegitimate reviews to deceive customers about the quality of products. These are called spam reviews; they confuse consumers on online shopping platforms and negatively affect online shopping behavior. We propose a dataset called ViSpamReviews, built with a strict annotation procedure, for detecting spam reviews on e-commerce platforms. Our dataset supports two tasks: a binary classification task for detecting whether a review is spam, and a multi-class classification task for identifying the type of spam. PhoBERT obtained the highest results on both tasks, with macro-averaged F1 scores of 86.89% and 72.17%, respectively.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

The paper attempts to address the problem of detecting spam reviews on Vietnamese e-commerce websites. Specifically, the authors constructed a dataset named ViSpamReviews, which contains over 19,000 user reviews manually annotated to identify spam reviews and their types through a rigorous annotation process. The paper proposes two tasks: the first task is a binary classification task to determine whether a review is spam or not; the second task is a multi-class classification task to identify the specific type of spam review. The authors also applied various classification models, including deep neural network-based models (such as Text-CNN, LSTM, GRU) and Transformer-based models (such as PhoBERT and BERT4News), and evaluated their performance on the dataset. Through experimental results, the PhoBERT model achieved the best performance on both tasks, with macro-average F1 scores of 86.89% and 72.17%, respectively. Additionally, the authors analyzed the mispredictions and found that the main challenge lies in distinguishing between normal and spam reviews, especially when the review content is short or only involves the brand rather than the product itself. Finally, the authors proposed future research directions, including expanding the dataset to detect spam paragraphs within reviews and identifying user opinions on specific product features and related services.
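Both tasks are scored with the macro-averaged F1, which computes an F1 score for each class separately and then takes the unweighted mean, so a rare spam type counts as much as a frequent class. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the integer labels in the usage note are assumptions made for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Score every class that appears in either the gold or predicted labels.
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []

    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        # Treat undefined precision or recall as 0.0 (one common convention).
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        f1_scores.append(f1)

    return sum(f1_scores) / len(f1_scores)
```

For example, with hypothetical binary labels (0 for non-spam, 1 for spam), `macro_f1([0, 0, 1, 1], [0, 1, 1, 1])` averages a per-class F1 of 2/3 for class 0 and 4/5 for class 1. The same function applies unchanged to the multi-class task, since it loops over whatever labels are present.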

Metadata Integration for Spam Reviews Detection on Vietnamese E-commerce Websites

Detecting Vietnamese Opinion Spam

Analyzing and Detecting Adversarial Spam on a Large-scale Online APP Review System.

TopicSpam: a Topic-Model Based Approach for Spam Detection.

Opinion Spam Recognition Method for Online Reviews using Ontological Features

Vietnamese Complaint Detection on E-Commerce Websites

Towards a General Rule for Identifying Deceptive Opinion Spam

Content-based Approach for Vietnamese Spam SMS Filtering

Detecting Spam Reviews in Arabic by Deep Learning

A Comparative Study of Sentiment Analysis Methods for Detecting Fake Reviews in E-Commerce

T-Bert: A Spam Review Detection Model Combining Group Intelligence and Personalized Sentiment Information

An Efficient Model for Sentiment Analysis of Electronic Product Reviews in Vietnamese

Learning to identify review spam

Reinforcement of Pre-trained Bert Architecture for the Detection of Spam Reviews

Toward a Language Modeling Approach for Consumer Review Spam Detection

Designing a deep learning-based application for detecting fake online reviews

Towards Online Review Spam Detection

Vietnamese AI Generated Text Detection

Sentiment Analysis of Customer Feedback in Online Food Ordering Services

Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms