Abstract: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Volume 32, Issue 01, Page 1-20, January 2024. Social media networks such as Facebook and Twitter have evolved into valuable platforms for global communication. However, because of its extensive user base, Twitter is often misused by illegitimate users engaging in illicit activities. Numerous research papers address illegitimate users on Twitter, but most of them fail to handle class imbalance, which significantly impacts the effectiveness of spam detection. The few works that do address class imbalance have not applied bio-inspired algorithms to balance the dataset. Therefore, we introduce PSOB-U, a particle swarm optimization-based undersampling technique designed to balance the Twitter dataset. In PSOB-U, various classifiers and metrics are employed to select majority samples and rank them. Furthermore, an ensemble learning approach combines the base classifiers in three stages. During the training of the base classifiers, undersampling techniques and a cost-sensitive random forest (CS-RF) address the imbalanced data at both the data level and the algorithmic level. In the first stage, the imbalanced dataset is balanced using random undersampling, particle swarm optimization-based undersampling, and random oversampling. In the second stage, a classifier is constructed for each of the balanced datasets obtained through these sampling techniques. In the third stage, a majority voting method aggregates the predicted outputs of the three classifiers. The evaluation results demonstrate that our proposed method significantly enhances the detection of illegitimate users in the imbalanced Twitter dataset. Additionally, we compare our proposed work with existing models, and the results show that our spam detection model outperforms state-of-the-art spam detection models that address the class imbalance problem. The combination of particle swarm optimization-based undersampling and ensemble learning with majority voting results in more accurate spam detection.
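The first and third stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`random_undersample`, `random_oversample`, `majority_vote`) are hypothetical, the PSO-based undersampling step and the CS-RF training of the second stage are omitted because the abstract does not specify them, and the toy data stands in for the Twitter dataset.

```python
import random
from collections import Counter

def _split_indices(y, majority):
    """Return (majority indices, minority indices) for a binary label list."""
    maj = [i for i, label in enumerate(y) if label == majority]
    mino = [i for i, label in enumerate(y) if label != majority]
    return maj, mino

def random_undersample(X, y, majority, rng):
    """Keep a random subset of majority samples equal in size to the minority.

    Assumes `majority` really is the larger class.
    """
    maj, mino = _split_indices(y, majority)
    keep = rng.sample(maj, len(mino)) + mino
    return [X[i] for i in keep], [y[i] for i in keep]

def random_oversample(X, y, majority, rng):
    """Duplicate random minority samples until both classes are equal in size."""
    maj, mino = _split_indices(y, majority)
    extra = [rng.choice(mino) for _ in range(len(maj) - len(mino))]
    keep = maj + mino + extra
    return [X[i] for i in keep], [y[i] for i in keep]

def majority_vote(predictions):
    """Aggregate per-classifier label lists into one label per sample.

    With three classifiers and binary labels a tie cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Toy data: 8 legitimate users (label 0) and 2 spammers (label 1).
rng = random.Random(0)
X = list(range(10))
y = [0] * 8 + [1] * 2

X_under, y_under = random_undersample(X, y, majority=0, rng=rng)
X_over, y_over = random_oversample(X, y, majority=0, rng=rng)

# Hypothetical predictions from three base classifiers on three test samples.
combined = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
print(combined)  # [1, 0, 1]
```

In a fuller pipeline, each balanced dataset would train its own base classifier, and `majority_vote` would receive the three classifiers' predictions on the same test samples.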

Thematic context vector association based on event uncertainty for Twitter

Event Evolution Model for Cybersecurity Event Mining in Tweet Streams

Event detection in Twitter: A keyword volume approach

Bio-Inspired Algorithm Based Undersampling Approach and Ensemble Learning for Twitter Spam Detection

An event detection technique using social media data

Deep Learning based Topic Analysis on Financial Emerging Event Tweets

The early bird catches the term: combining twitter and news data for event detection and situational awareness

Event detection in Colombian security Twitter news using fine-grained latent topic analysis

Enhanced Twitter Sentiment Classification Using Contextual Information

Predicting word vectors for microtext

EviDense: a Graph-based Method for Finding Unique High-impact Events with Succinct Keyword-based Descriptions

Temporal Analysis on Topics Using Word2Vec

Context Based Model for Temporal Twitter Summarization

Event Detection and Summarization Using Phrase Network

Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network

Using semantic clustering to support situation awareness on Twitter: the case of world views

A deep semantic matching approach for identifying relevant messages for social media analysis

EnrichEvent: Enriching Social Data with Contextual Information for Emerging Event Extraction

A Multi-View Clustering Model For Event Detection In Twitter

Probabilistic Model of Narratives Over Topical Trends in Social Media: A Discrete Time Model

Text embedding techniques for efficient clustering of twitter data