DTOR: Decision Tree Outlier Regressor to explain anomalies

Riccardo Crupi, Daniele Regoli, Alessandro Damiano Sabatino, Immacolata Marano, Massimiliano Brinis, Luca Albertazzi, Andrea Cirillo, Andrea Claudio Cosentini

2024-05-13

Abstract: Explaining why outliers occur and the mechanism behind their occurrence can be extremely important in a variety of domains. Malfunctions, frauds, and threats, in addition to being correctly identified, often need a valid explanation so that effective countermeasures can be taken. The increasingly widespread use of sophisticated Machine Learning approaches to identify anomalies makes such explanations more challenging. We present the Decision Tree Outlier Regressor (DTOR), a technique for producing rule-based explanations for individual data points by estimating the anomaly scores generated by an anomaly detection model. This is accomplished by first fitting a Decision Tree Regressor, which estimates the anomaly score, and then extracting the path in the tree associated with the data point's score. Our results demonstrate the robustness of DTOR even on datasets with a large number of features. Additionally, in contrast to other rule-based approaches, the generated rules are consistently satisfied by the points to be explained. Furthermore, our evaluation metrics indicate performance comparable to Anchors in outlier explanation tasks, with reduced execution time.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses the interpretability problem in machine-learning-based anomaly detection. In internal audits in sectors such as banking, anomaly detection techniques are used to identify unusual data points such as faults, fraud, or threats. Interpreting these techniques is challenging, however, because explanations must be provided to internal auditors who may not have expertise in data analysis. Existing explanation methods based on feature importance, such as SHAP, may offer limited interpretability for complex models or high-dimensional datasets. The paper introduces the Decision Tree Outlier Regressor (DTOR), which generates rule-based explanations by estimating the anomaly scores produced by an anomaly detection model. DTOR fits a decision tree regressor to estimate the score and then extracts the tree path associated with the data point's score. The approach is robust on datasets with a large number of features and, unlike other rule-based approaches, yields rules that are consistently satisfied by the data point being explained. Compared with Anchors, another rule-based explanation method, DTOR has a shorter execution time and comparable performance in anomaly explanation tasks. Its contribution is to expose transparent decision logic for anomaly detection models, so that non-data scientists can understand why a point was flagged as anomalous, which in turn supports risk assessment and decision-making in internal audits of banking systems.
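The two steps described above (fit a regression tree to anomaly scores, then read off the path followed by the point to be explained) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the anomaly scorer is a stand-in (distance from the centroid) for a real anomaly detection model, the tree is a small hand-written variance-minimising regressor, and the feature names `amount` and `frequency`, the toy data, and all function names are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth=0, max_depth=3, min_leaf=1):
    """Fit a small regression tree (nested dicts) that estimates y from X."""
    node = {"value": mean(y)}
    if depth >= max_depth or len(y) < 2 * min_leaf or sse(y) == 0:
        return node
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, j, t, left, right)
    if best is None:
        return node
    _, j, t, left, right = best
    node.update(
        feature=j,
        threshold=t,
        left=fit_tree([X[i] for i in left], [y[i] for i in left],
                      depth + 1, max_depth, min_leaf),
        right=fit_tree([X[i] for i in right], [y[i] for i in right],
                       depth + 1, max_depth, min_leaf),
    )
    return node


def explain(tree, x, names):
    """Return the split conditions on the path x follows, plus the leaf estimate."""
    rule, node = [], tree
    while "feature" in node:
        j, t = node["feature"], node["threshold"]
        if x[j] <= t:
            rule.append(f"{names[j]} <= {t:.2f}")
            node = node["left"]
        else:
            rule.append(f"{names[j]} > {t:.2f}")
            node = node["right"]
    return rule, node["value"]


# Toy data (hypothetical): a 5x5 grid of ordinary points plus one outlier.
names = ["amount", "frequency"]
X = [[a * 0.5, b * 0.5] for a in range(5) for b in range(5)]
outlier = [10.0, 1.0]
X.append(outlier)

# Stand-in anomaly scores (assumption): Euclidean distance from the centroid.
# In practice these would come from the anomaly detection model being explained.
centroid = [mean(col) for col in zip(*X)]
scores = [sum((v - c) ** 2 for v, c in zip(row, centroid)) ** 0.5 for row in X]

tree = fit_tree(X, scores)
rule, estimate = explain(tree, outlier, names)
print(" AND ".join(rule), "-> estimated score", round(estimate, 3))
```

Because the rule is read directly from the branches that the point itself takes through the tree, every condition in it holds for that point by construction, which is the property the abstract contrasts with other rule-based approaches.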

Decision Tree Regression with Residual Outlier Detection

Rule-based Out-Of-Distribution Detection

Local Rule-Based Explanations of Black Box Decision Systems

BELLATREX: Building Explanations through a LocaLly AccuraTe Rule EXtractor

DORO: Distributional and Outlier Robust Optimization

Explaining outliers and anomalous groups via subspace density contrastive loss

PUPAE: Intuitive and Actionable Explanations for Time Series Anomalies

On Predictive Explanation of Data Anomalies

DORA: Exploring Outlier Representations in Deep Neural Networks

Towards Meaningful Anomaly Detection: The Effect of Counterfactual Explanations on the Investigation of Anomalies in Multivariate Time Series

RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space

GPTree: Towards Explainable Decision-Making via LLM-powered Decision Trees

Trusting deep learning natural-language models via local and global explanations

Robust Explainer Recommendation for Time Series Classification

DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications

AcME-AD: Accelerated Model Explanations for Anomaly Detection

Coevolutionary Algorithm for Building Robust Decision Trees under Minimax Regret

An Explainable Bayesian Decision Tree Algorithm

Interpretable Outlier Summarization

Interpreting Unsupervised Anomaly Detection in Security Via Rule Extraction