Abstract: Organizations and individuals worldwide are becoming increasingly vulnerable to cyberattacks as phishing continues to grow and the number of phishing websites rises. Improved cyber defense therefore requires more effective phishing detection (PD). In this paper, we introduce a novel method for detecting phishing sites with high accuracy. Our approach, which is founded on integrated and deep learning (DL), uses a Convolutional Neural Network (CNN)-based model to distinguish legitimate websites from phishing websites. We evaluate the model on the PhishTank dataset, which is widely used for detecting phishing websites based solely on Uniform Resource Locator (URL) features. We created a real dataset by crawling 10,000 phishing URLs from PhishTank and 10,000 legitimate websites, and then ran experiments on it using standard evaluation metrics. The experimental results show that the proposed method performs well in terms of both accuracy and false-positive rate, and that it outperforms previous state-of-the-art models. By comparison, when binary-categorical loss and the Adam optimizer are used, the k-nearest neighbors (KNN), Natural Language Processing (NLP), Recurrent Neural Network (RNN), and Random Forest (RF) models from previous publications reach accuracies of 87%, 97.98%, 97.4%, and 94.26%, respectively. We attribute the improvement over previous work to several factors, including the use of more layers, larger training sizes, and the extraction of additional features from the PhishTank dataset.
Specifically, our proposed model comprises seven layers: it starts with an input layer and includes convolutional, pooling, and linear layers (linear 1 and 2), with the final linear layer serving as the output layer. These design choices contribute to the high accuracy of our model, which achieved a 98.77% accuracy rate.
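
The two metrics named in the abstract, accuracy and false-positive rate, can be computed from a binary confusion matrix. The following is a minimal sketch in plain Python, not the evaluation code used in the paper. The label encoding (1 for phishing, 0 for legitimate), the function names, and the toy labels are assumptions made for illustration only.

```python
from typing import Dict, Sequence

# Assumed label encoding (not stated in the abstract).
PHISHING = 1
LEGITIMATE = 0


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives, treating phishing as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == PHISHING and pred == PHISHING:
            counts["tp"] += 1
        elif truth == LEGITIMATE and pred == PHISHING:
            counts["fp"] += 1  # legitimate site wrongly flagged as phishing
        elif truth == LEGITIMATE and pred == LEGITIMATE:
            counts["tn"] += 1
        else:
            counts["fn"] += 1  # phishing site that was missed
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Fraction of all URLs that were classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


def false_positive_rate(counts: Dict[str, int]) -> float:
    """Fraction of legitimate URLs that were wrongly flagged as phishing."""
    negatives = counts["fp"] + counts["tn"]
    return counts["fp"] / negatives if negatives else 0.0


# Toy labels for illustration only; these are not results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(counts), false_positive_rate(counts))
```

On a balanced dataset such as the 10,000 phishing and 10,000 legitimate URLs described above, accuracy is a reasonable headline figure. The false-positive rate separately shows how often legitimate sites would be blocked.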
