Abstract: Background: Explainability in phishing detection models can support phishing attack mitigation by increasing trust and improving understanding of how phishing is detected. Objective: This study aims to determine and recommend the best approach, one whose components can fulfil these critical needs. Methods: The methodology starts by analyzing both black-box and white-box models to identify their pros and cons specifically in phishing detection. The conclusions of the analysis are then validated experimentally using a set of well-known algorithms and public phishing datasets. The experimental metrics cover three measurements, including predictive accuracy and explainability metrics. Conclusion: Both model types are comparable in terms of interpretability and consistency, with room for improvement on diverse datasets. EBM, as an example of a white-box model, is generally better suited to applications requiring explainability and actionable insights. Finally, white-box and black-box models each have positive and negative aspects in both performance and explainability metrics, so it is important to consider the objective of model usage.

What problem does this paper attempt to address?

This paper aims to address the trade-off between model explainability and prediction accuracy in phishing attack detection. Specifically, the research objectives are:

1. **Determine the advantages and disadvantages of black-box and white-box models in phishing detection**: Compare and analyze black-box models (such as deep neural networks and random forests) and white-box models (such as decision trees and logistic regression) to understand their performance in phishing detection.
2. **Provide the best recommendation**: Based on the analysis results, select the most appropriate model type for practical applications to meet the key requirements, especially in scenarios where explainability is required.
3. **Enhance understanding of phishing attacks and trust in their detection**: By improving model explainability, increase users' trust in the phishing detection system and help them better understand how phishing attacks are detected.
4. **Evaluate the performance and explainability of different models**: Use public datasets and a set of well-known algorithms for experimental verification, and evaluate the models in terms of prediction accuracy and explainability.
5. **Explore the application of Explainable Artificial Intelligence (XAI) techniques**: Investigate how XAI techniques (such as LIME and SHAP) can explain the decision-making process of black-box models, thereby enhancing their transparency and credibility.

### Research background

Phishing is a common form of cybercrime that poses serious security threats to individuals and organizations. Machine learning models are widely used in phishing detection, but the internal decision-making processes of many models (especially black-box models) are difficult to explain, which limits their application in high-risk areas. Improving model explainability has therefore become an important research direction.

### Main objectives

- **Improve the transparency of the phishing detection system**: By explaining the model's decision-making process, enable users to understand why a given website or email is classified as phishing.
- **Enhance user trust**: By providing clear explanations, increase users' trust in the results of the detection system.
- **Optimize model selection**: Based on specific requirements (such as prediction accuracy and explainability), recommend the most suitable model type for practical applications.

### Methods

- **Analyze black-box and white-box models**: Evaluate the advantages and disadvantages of different model types along multiple dimensions (such as prediction accuracy, explainability, and consistency).
- **Experimental verification**: Run experiments on public datasets to verify the performance of different models, and ensure the reliability of the results through statistical tests.
- **Apply XAI techniques**: For black-box models, use techniques such as LIME and SHAP to generate explanations and improve their transparency.

### Conclusions

Through the comparative analysis of black-box and white-box models, the research found that:

- **EBM (Explainable Boosting Machine)**, as an example of a white-box model, is generally better suited to applications that require explainability and actionable insights.
- **Each model type has its own advantages and disadvantages**: Black-box and white-box models differ in both performance and explainability, and the choice should be driven by the requirements of the application scenario.

Overall, this paper emphasizes the importance of model explainability in phishing detection for improving user trust and system reliability.
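
The contrast between reading a white-box model directly and explaining a model after the fact can be illustrated with a minimal sketch. It is not the paper's experimental setup: the data is synthetic, the feature names (`url_length`, `num_dots`, `has_ip_host`, `random_noise`) are hypothetical, a hand-written logistic regression stands in for the white-box model, and permutation importance stands in for model-agnostic XAI techniques such as LIME and SHAP. It uses only the Python standard library.

```python
import math
import random

# Hypothetical, pre-scaled URL features; "random_noise" carries no signal.
FEATURES = ["url_length", "num_dots", "has_ip_host", "random_noise"]

def make_dataset(n, rng):
    """Generate synthetic (features, label) pairs; label 1 means phishing."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        sign = 1.0 if label else -1.0
        X.append([
            rng.gauss(1.5 * sign, 1.0),   # strongly informative
            rng.gauss(0.5 * sign, 1.0),   # weakly informative
            1.0 if rng.random() < (0.4 if label else 0.02) else 0.0,
            rng.gauss(0.0, 1.0),          # uninformative
        ])
        y.append(label)
    return X, y

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression with full-batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, row):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, row)) + b >= 0 else 0

def accuracy(model, X, y):
    return sum(predict(model, r) == l for r, l in zip(X, y)) / len(y)

def permutation_importance(model, X, y, rng):
    """Accuracy drop when one feature column is shuffled (model-agnostic)."""
    base = accuracy(model, X, y)
    drops = {}
    for j, name in enumerate(FEATURES):
        col = [row[j] for row in X]
        rng.shuffle(col)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        drops[name] = base - accuracy(model, X_perm, y)
    return drops

rng = random.Random(0)
X_train, y_train = make_dataset(600, rng)
X_test, y_test = make_dataset(400, rng)

model = train_logistic(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)

# White-box view: the learned coefficients are the explanation.
weights = dict(zip(FEATURES, model[0]))
# Post hoc view: treat the model as opaque and probe it from outside.
importances = permutation_importance(model, X_test, y_test, rng)

print("accuracy:", round(test_accuracy, 3))
print("weights:", {k: round(v, 2) for k, v in weights.items()})
print("permutation importance:", {k: round(v, 3) for k, v in importances.items()})
```

In this sketch both views should rank the strongly informative feature first, which is the kind of agreement between explanation methods that a consistency metric would look for. With a real black-box model only the second view is available, which is why post hoc techniques matter there.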

Comparative Analysis of Black-Box and White-Box Machine Learning Model in Phishing Detection

Enhancing Phishing Detection through Feature Importance Analysis and Explainable AI: A Comparative Study of CatBoost, XGBoost, and EBM Models

Comparative evaluation of machine learning algorithms for phishing site detection

Investigation of Phishing Susceptibility with Explainable Artificial Intelligence

PhishGuard: A Convolutional Neural Network Based Model for Detecting Phishing URLs with Explainability Analysis

Can Features for Phishing URL Detection Be Trusted Across Diverse Datasets? A Case Study with Explainable AI

Analysis of the Performance Impact of Fine-Tuned Machine Learning Model for Phishing URL Detection

Intelligent Methods for Accurately Detecting Phishing Websites

Mitigating Bias in Machine Learning Models for Phishing Webpage Detection

An efficient multistage phishing website detection model based on the CASE feature framework: Aiming at the real web environment

Man versus Machine: AutoML and Human Experts' Role in Phishing Detection

A Survey of Machine Learning-Based Solutions for Phishing Website Detection

Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages

Phishing website detection: How effective are deep learning‐based models and hyperparameter optimization?

Light gradient boosting machine-based phishing webpage detection model using phisher website features of mimic URLs

Comparative Study of CatBoost, XGBoost, and LightGBM for Enhanced URL Phishing Detection: A Performance Assessment

AI Meta-Learners and Extra-Trees Algorithm for the Detection of Phishing Websites

Novel Interpretable and Robust Web-based AI Platform for Phishing Email Detection

Phishing Website Detection through Multi-Model Analysis of HTML Content

A Comprehensive Analysis of Explainable AI for Malware Hunting

A New Generation Gap? Some Thoughts on the Consequences of Early ICT First Contact