Abstract: Explainable phishing detection approaches are usually based on references, i.e., they compare a suspicious webpage against a reference list of commonly targeted legitimate brands' webpages. If a webpage is detected as similar to any referenced website but their domains are not aligned, a phishing alert is raised with an explanation naming the targeted brand. Compared to other techniques, such explainable reference-based solutions are more robust to ever-changing phishing webpages. However, webpage similarity is still measured by representations that convey only partial intentions (e.g., screenshot and logo), which (i) incurs considerable false positives and (ii) gives an adversary opportunities to compromise user confidence in the approaches. In this work, we propose PhishIntention, which extracts the precise phishing intention of a webpage by visually (i) extracting its brand intention and credential-taking intention, and (ii) interacting with the webpage to confirm the credential-taking intention. We design PhishIntention as a heterogeneous system of deep learning vision models, overcoming various technical challenges. The models "look at" and "interact with" the webpage to infer its intention, which makes them robust to potential HTML obfuscation. We compare PhishIntention with four state-of-the-art reference-based approaches on the largest phishing identification dataset, consisting of 50K phishing and benign webpages. At a similar level of recall, PhishIntention achieves significantly higher precision than the baselines. Moreover, we conduct a continuous field study on the Internet for two months to discover emerging phishing webpages. PhishIntention detects 1,942 new phishing webpages (1,368 not reported by VirusTotal). Compared to the best baseline, PhishIntention generates 86.5% fewer false alerts (139 vs. 1,033 false positives) while detecting a similar number of real phishing webpages.
Our models and code are available at https://github.com/lindsey98/PhishIntention.git.
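
The alert rule described in the abstract can be sketched as follows. This is a minimal illustration, not the PhishIntention implementation: the names `Reference`, `PageObservation`, and `phishing_alert` are hypothetical, the brand and domains are made up, and the two vision-model outputs (`matched_brand`, `asks_for_credentials`) are assumed to be computed elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Reference:
    """One entry of the reference list (hypothetical structure)."""
    brand: str
    domains: FrozenSet[str]  # domains legitimately used by the brand


@dataclass(frozen=True)
class PageObservation:
    """What is assumed to be known about a suspicious page."""
    domain: str
    matched_brand: Optional[str]  # assumed output of a visual brand matcher
    asks_for_credentials: bool    # assumed output of a credential-taking check


def phishing_alert(page: PageObservation,
                   references: Dict[str, Reference]) -> Optional[str]:
    """Return the targeted brand if an alert is raised, otherwise None."""
    if page.matched_brand is None:
        return None  # the page resembles no referenced brand
    ref = references.get(page.matched_brand)
    if ref is None:
        return None  # matched brand is not in the reference list
    if page.domain in ref.domains:
        return None  # domains are aligned: the brand's own page
    if not page.asks_for_credentials:
        return None  # brand look-alike without credential-taking intention
    return ref.brand  # the explanation carries the targeted brand


# Illustrative reference list with a made-up brand.
refs = {"ExampleBank": Reference("ExampleBank", frozenset({"examplebank.com"}))}

legit = PageObservation("examplebank.com", "ExampleBank", True)
phish = PageObservation("examp1ebank-login.net", "ExampleBank", True)
fan_page = PageObservation("bank-reviews.org", "ExampleBank", False)
```

Under these assumptions, only `phish` triggers an alert. The `fan_page` case shows where requiring a credential-taking intention, in addition to brand similarity, would suppress a false positive that a purely similarity-based rule would raise.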
