Abstract: Spam reviews are a pervasive problem on online platforms due to their significant impact on reputation. However, research into spam detection in data streams is scarce. Another concern is the need for transparency in detection solutions. Consequently, this paper addresses both problems by proposing an online solution for identifying and explaining spam reviews that incorporates data drift adaptation. It integrates (i) incremental profiling, (ii) data drift detection and adaptation, and (iii) identification of spam reviews using Machine Learning. The explainability mechanism displays a visual and textual explanation of each prediction in a dashboard. The best results reached up to 87 % spam F-measure.

### What problem does this paper attempt to address?

This paper aims to solve two main problems in spam review detection on online platforms:

1. **Spam review detection in data streams**: Spam reviews are common on online platforms and have a significant impact on reputation, yet relatively few studies address their detection in data streams. Traditional detection methods are usually built on static datasets, whereas in practice review data changes dynamically. A method that can process and adapt to these changes in real time is therefore required.
2. **Transparency and interpretability**: For users to trust and understand a detection system, its results need to be presented in an intuitive, easy-to-understand way. Existing methods often lack sufficient explanation mechanisms, making it difficult for users to understand why a given review is marked as spam.

To this end, the paper proposes an online spam review detection framework combined with data drift adaptation. The framework consists of three key modules:

- **Incremental Profiling**: Extracts features from user-generated content with natural language processing (NLP) techniques and incrementally updates user profiles.
- **Data Drift Detection & Adaptation**: Monitors the input data, identifies data drift, and adapts to it so that the model maintains high accuracy when the data distribution changes.
- **Identification and Explanation of Spam Reviews**: Uses machine learning to identify spam reviews and displays each prediction on a dashboard with a visual and textual explanation.

In the reported experiments, the framework reaches a spam F-measure of up to 87%, and the dashboard explanations address the transparency requirement.
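The drift detection and adaptation module described above can be illustrated with a minimal sketch. The summary does not name the detector the paper uses, so the windowed error-rate comparison below is a simplified stand-in, not the authors' method. The class name `WindowDriftDetector`, the helper `find_drift_points`, and the `window_size` and `threshold` parameters are all hypothetical.

```python
from collections import deque


class WindowDriftDetector:
    """Hypothetical detector: flags drift when the recent error rate
    exceeds the reference error rate by more than `threshold`."""

    def __init__(self, window_size: int = 20, threshold: float = 0.2):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = deque(maxlen=window_size)  # errors from the stable period
        self.recent = deque(maxlen=window_size)     # sliding window of latest errors

    def update(self, error: int) -> bool:
        """Record one prediction outcome (1 = wrong, 0 = right); return True on drift."""
        if len(self.reference) < self.window_size:
            self.reference.append(error)  # still filling the reference window
            return False
        self.recent.append(error)
        if len(self.recent) < self.window_size:
            return False
        ref_rate = sum(self.reference) / self.window_size
        recent_rate = sum(self.recent) / self.window_size
        if recent_rate - ref_rate > self.threshold:
            # Adaptation placeholder: the recent window becomes the new reference.
            # A real system would also retrain or reset the classifier here.
            self.reference = deque(self.recent, maxlen=self.window_size)
            self.recent.clear()
            return True
        return False


def find_drift_points(errors, window_size: int = 20, threshold: float = 0.2):
    """Return the stream positions at which the detector reports drift."""
    detector = WindowDriftDetector(window_size, threshold)
    return [i for i, e in enumerate(errors) if detector.update(e)]


# Synthetic stream: 10% errors for 100 samples, then 50% errors for 100 samples.
stable = [1 if i % 10 == 0 else 0 for i in range(100)]
drifted = [1 if i % 2 == 0 else 0 for i in range(100)]
drift_points = find_drift_points(stable + drifted)
```

On this synthetic stream the detector stays silent during the stable segment and reports drift shortly after the error rate rises. Comparing two fixed-size windows keeps memory constant per stream, which suits an online setting, but the window size and threshold would need tuning on real review data.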
### Formula summary

- **Incremental average calculation**:
  \[ favg_{t_k} = \frac{1}{k}\sum_{i=0}^{k} f_{t_i} \]
  where \(f\) represents a feature, and \([f_{t_0}, f_{t_1}, \ldots, f_{t_k}]\) represents the past values of that feature for each user.
- **Incremental maximum calculation**:
  \[ fmax_{t_k} = \max_{0 \le i \le k} f_{t_i} \]

These formulas are used to calculate the incremental features of users and items, so as to better capture how the data changes over time.
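The incremental average and maximum above can be maintained without storing the full history of a user's feature values. The sketch below is one way to do that, under stated assumptions: the class `IncrementalFeature`, the function `update_profile`, and the example feature (review length in words) are hypothetical and not taken from the paper. The sketch divides by the number of values actually observed. Since the indices \(0\) through \(k\) cover \(k+1\) values, this is an assumption about the intended normalization.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class IncrementalFeature:
    """Running average and maximum of one feature for one user."""

    count: int = 0                  # number of values observed so far
    average: float = 0.0            # running mean of the observed values
    maximum: float = float("-inf")  # running maximum of the observed values

    def update(self, value: float) -> None:
        self.count += 1
        # Recursive mean update: avg_new = avg_old + (value - avg_old) / count
        self.average += (value - self.average) / self.count
        self.maximum = max(self.maximum, value)


# One statistics object per user, created on first access.
profiles = defaultdict(IncrementalFeature)


def update_profile(user_id: str, value: float) -> IncrementalFeature:
    """Fold a new feature value into the user's profile and return it."""
    stat = profiles[user_id]
    stat.update(value)
    return stat


# Hypothetical feature: review length in words for three reviews by one user.
for length in (2.0, 4.0, 9.0):
    profile = update_profile("user_a", length)
```

Each update takes constant time and memory per user and feature, so the profile can be refreshed as every review arrives in the stream.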

Online detection and infographic explanation of spam reviews with data drift adaptation
