The Explainability of Transformers: Current Status and Directions

Paolo Fantozzi, Maurizio Naldi
DOI: https://doi.org/10.3390/computers13040092
2024-04-04
Computers
Abstract: An increasing demand for model explainability has accompanied the widespread adoption of transformers in various fields of application. In this paper, we survey the existing literature on the explainability of transformers. We provide a taxonomy of methods based on the combination of transformer components that are leveraged to arrive at the explanation. For each method, we describe its mechanism and survey its applications. We find that attention-based methods, both alone and in conjunction with activation-based and gradient-based methods, are the most widely used. Growing attention is also devoted to the deployment of visualization techniques to support the explanation process.