Explainable Neural Networks: Achieving Interpretability in Neural Models
Manomita Chakraborty
DOI: https://doi.org/10.1007/s11831-024-10089-4
IF: 9.7
2024-03-22
Archives of Computational Methods in Engineering
Abstract:Data mining is the most widely used method for discovering knowledge. There are numerous data mining tasks, with classification being the most frequently encountered task in various application domains such as fraud detection, disease diagnosis, text classification, and so on. Many classification techniques, such as Bayesian classifiers, decision trees, genetic algorithms, neural networks (NNs), and so on, are available to help researchers solve problems in a variety of domains. However, NNs are the most frequently used classification approach because they are effective at solving classification problems that cannot be divided into linear and non-linear categories, have high classification accuracy on large datasets, and require minimal processing effort. Despite having good classification performances, NNs have a pitfall associated with them which hinders their applicability in some real-world applications. NNs are black boxes in nature, which means they cannot make transparent decisions that humans can interpret. Because of this limitation, NNs are unsuitable for many applications that require transparency in decision-making as well as high accuracy, such as audit mining or medical diagnosis. The well-known solution to this inherent disadvantage of NNs is to extract explainable decision rules from them. The extracted rules provide a detailed understanding of how NNs work in a human-readable format. Rule extraction is a well-established technique with a plethora of literature on the subject. However, there are very few papers whose primary goal is to survey the existing literature. As a result, the goal of this work is to provide a detailed analysis of the existing literature and to create a framework for existing and new researchers to conduct research in this field. The paper examines the state-of art from the perspective of designing framework of the algorithms, evaluation criteria, and applications.
computer science, interdisciplinary applications,engineering, multidisciplinary,mathematics