Abstract: Data mining is the most widely used method for discovering knowledge. Of the many data mining tasks, classification is the one encountered most often, in application domains such as fraud detection, disease diagnosis, and text classification. Many classification techniques, including Bayesian classifiers, decision trees, genetic algorithms, and neural networks (NNs), are available to help researchers solve problems in these domains. NNs are the most frequently used of these approaches because they are effective on classification problems that are not linearly separable, achieve high classification accuracy on large datasets, and require minimal processing effort. Despite this strong classification performance, NNs have a pitfall that hinders their use in some real-world applications: they are black boxes, meaning that they cannot make transparent decisions that humans can interpret. This limitation makes NNs unsuitable for applications that require transparency in decision-making as well as high accuracy, such as audit mining or medical diagnosis. The well-known solution to this inherent disadvantage is to extract explainable decision rules from the networks. The extracted rules describe, in a human-readable format, how the NN arrives at its decisions. Rule extraction is a well-established technique with a plethora of literature, yet very few papers have surveying that literature as their primary goal. The goal of this work is therefore to provide a detailed analysis of the existing literature and to create a framework that existing and new researchers can use to conduct research in this field. The paper examines the state of the art from the perspective of the design framework of the algorithms, evaluation criteria, and applications.

Post-hoc Interpretability for Neural NLP: A Survey

Which Neural Network Makes More Explainable Decisions? an Approach Towards Measuring Explainability

Local Interpretations for Explainable Natural Language Processing: A Survey

A Survey of the Interpretability Aspect of Deep Learning Models

A Survey on Neural Network Interpretability

Interpretability of deep learning models: A survey of results

Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification

Testing the effectiveness of saliency-based explainability in NLP using randomized survey-based experiments

The future of human-centric eXplainable Artificial Intelligence (XAI) is not post-hoc explanations

Interpretability in Graph Neural Networks

Explainable Neural Networks: Achieving Interpretability in Neural Models

Can I Trust the Explainer? Verifying Post-hoc Explanatory Methods

Neuron-level Interpretation of Deep NLP Models: A Survey

Interpreting Deep Learning Models in Natural Language Processing: A Review

What is Interpretability?

Explainability of Text Processing and Retrieval Methods: A Critical Survey

In Defence of Post-hoc Explainability

Visual Interpretability for Deep Learning: a Survey

Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications