Abstract: With the development of network technology, the number of gambling websites has grown dramatically, posing a threat to social stability. Many machine learning-based methods have been proposed to identify gambling websites by analyzing the URLs, text, and images of the websites. Nevertheless, most existing methods ignore one important piece of information: the text within website images. Only the visual features of the images are extracted for detection, while the semantic features of the text on the images are ignored. This text, however, carries key information that clearly points to gambling websites and can help identify such websites more accurately. How to fuse image and text multimodal data is therefore a challenge that must be addressed. Motivated by this, we propose a hybrid multimodal data fusion-based method for identifying gambling websites by extracting and fusing visual and semantic features of website screenshots. First, we fine-tune a pretrained ResNet34 model to train an image classifier and extract visual features of webpage screenshots. Second, we extract textual content from webpage screenshots using optical character recognition (OCR). We use pretrained Word2Vec word vectors to initialize the embedding layer and use a Bi-LSTM to train a text classifier and extract semantic features of the textual content on the screenshots. Third, we use self-attention to fuse the visual and semantic features and train a multimodal classifier. The predictions of the image, text, and multimodal classifiers are then combined by late fusion to obtain the final prediction. To demonstrate the effectiveness of the proposed method, we conduct experiments on a webpage screenshot dataset that we collected. The experimental results indicate that the OCR text on webpage screenshots has strong semantic features and that the proposed hybrid multimodal data fusion-based method effectively improves performance in identifying gambling websites, with accuracy, precision, recall, and F1-score all above 99%.
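
The two fusion steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not give the attention configuration or the late fusion rule, so the sketch assumes single-head scaled dot-product self-attention with identity projections over two modality tokens, and late fusion by weighted averaging of class probabilities. The function names `self_attention_fuse` and `late_fusion` and all numeric inputs are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fuse(visual, semantic):
    """Fuse two equal-length feature vectors with self-attention.

    Assumption: identity query/key/value projections and mean pooling;
    a trained model would use learned projection matrices instead.
    """
    tokens = [visual, semantic]
    d = len(visual)
    attended = []
    for q in tokens:
        # Scaled dot-product score of this token against every token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Attention output: weighted sum of the value vectors.
        attended.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    # Mean-pool the attended tokens into one fused feature vector.
    return [sum(t[i] for t in attended) / len(attended) for i in range(d)]


def late_fusion(prob_lists, weights=None):
    """Combine per-classifier class probabilities by weighted averaging.

    Assumption: averaging is one common late fusion rule; the abstract
    does not say which rule the paper uses.
    """
    n = len(prob_lists)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    num_classes = len(prob_lists[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, prob_lists)) / total
        for c in range(num_classes)
    ]
    label = max(range(num_classes), key=lambda c: fused[c])
    return fused, label


if __name__ == "__main__":
    # Hypothetical [normal, gambling] probabilities from the image,
    # text, and multimodal classifiers for one screenshot.
    image_probs = [0.4, 0.6]
    text_probs = [0.2, 0.8]
    multimodal_probs = [0.7, 0.3]
    fused, label = late_fusion([image_probs, text_probs, multimodal_probs])
    print(fused, label)
    # Hypothetical 2-dimensional visual and semantic feature vectors.
    print(self_attention_fuse([1.0, 0.0], [0.0, 1.0]))
```

In a full pipeline, the fused feature vector would feed the multimodal classifier, whose probabilities would then enter `late_fusion` alongside those of the image and text classifiers.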

Multimodal fraudulent website identification method based on heterogeneous model ensemble

Anomaly Identification Model for Telecom Users Based on Machine Learning Model Fusion

A Hybrid Multimodal Data Fusion-Based Method for Identifying Gambling Websites

An ensemble classification method based on machine learning models for malicious Uniform Resource Locators (URL)

Image recognition model of fraudulent websites based on image leader decision and Inception-V3 transfer learning

Multiplex graph fusion network with reinforcement structure learning for fraud detection in online e-commerce platforms

Spotting Sneaky Scammers: Malicious Account Detection from a Chinese Financial Platform

Malicious URL Detection via Pretrained Language Model Guided Multi-Level Feature Attention Network

Phishing Website Detection Based on Deep Convolutional Neural Network and Random Forest Ensemble Learning

MEDAL: A Multimodality-Based Effective Data Augmentation Framework for Illegal Website Identification

A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts

An Ensemble-based Fraud Detection Model for Financial Transaction Cyber Threat Classification and Countermeasures

Financial Fraud Detection: A New Ensemble Learning Approach for Imbalanced Data

Phishing Websites Detection Via CNN and Multi-Head Self-Attention on Imbalanced Datasets

Multi-scale semantic deep fusion models for phishing website detection

Enhancing Credit Card Fraud Detection: A Neural Network and SMOTE Integrated Approach

A Late Multi-Modal Fusion Model for Detecting Hybrid Spam E-mail

Multimodal and Contrastive Learning for Click Fraud Detection

A Text Classification Model Combining Adversarial Training with Pre-trained Language Model and neural networks: A Case Study on Telecom Fraud Incident Texts

A stacking model using URL and HTML features for phishing webpage detection