Attending Via Both Fine-tuning and Compressing.

Jie Zhou, Yuanbin Wu, Qin Chen, Xuanjing Huang, Liang He
DOI: https://doi.org/10.18653/v1/2021.findings-acl.189
2021-01-01
Abstract: Although the attention mechanism is a primary means of enhancing the interpretability of neural networks, its reliability and validity are still under debate. In this paper, we purify attention scores to obtain a more faithful explanation of downstream models. Specifically, we propose a framework consisting of a learner and a compressor, which performs fine-tuning and compression iteratively to improve both the performance and the interpretability of the attention mechanism. The learner focuses on learning better text representations through fine-tuning so that the model makes good decisions, while the compressor compresses those representations to retain the most useful clues for explanation, using a Variational information bottleneck ATtention (VAT) mechanism. Extensive experiments on eight benchmark datasets show the advantages of the proposed approach in terms of both performance and interpretability.
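To make the compressor's role more concrete, here is a minimal sketch of a generic variational information bottleneck objective: a task loss plus a weighted KL term that penalizes how much information the compressed representation keeps. It is not the paper's VAT mechanism, whose exact formulation the abstract does not give. The function names, the diagonal-Gaussian posterior, the standard-normal prior, and the weight `beta` are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of logits.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ).
    # Assumed compression term: it grows as the posterior moves away
    # from the uninformative prior.
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def vib_loss(logits, label, mu, log_var, beta=0.1):
    # Hypothetical combined objective: cross-entropy (prediction quality)
    # plus beta * KL (compression). beta is an illustrative trade-off weight.
    cross_entropy = -np.log(softmax(logits)[label])
    return cross_entropy + beta * kl_to_standard_normal(mu, log_var)
```

A larger `beta` favors compression over task accuracy. How the paper alternates this kind of objective with fine-tuning is not specified in the abstract, so the sketch covers a single loss evaluation only.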