Advancing Interpretability in Text Classification through Prototype Learning

Bowen Wei,Ziwei Zhu
2024-10-24
Abstract:Deep neural networks have achieved remarkable performance in various text-based tasks but often lack interpretability, making them less suitable for applications where transparency is critical. To address this, we propose ProtoLens, a novel prototype-based model that provides fine-grained, sub-sentence level interpretability for text classification. ProtoLens uses a Prototype-aware Span Extraction module to identify relevant text spans associated with learned prototypes and a Prototype Alignment mechanism to ensure prototypes are semantically meaningful throughout training. By aligning the prototype embeddings with human-understandable examples, ProtoLens provides interpretable predictions while maintaining competitive accuracy. Extensive experiments demonstrate that ProtoLens outperforms both prototype-based and non-interpretable baselines on multiple text classification benchmarks. Code and data are available at \url{<a class="link-external link-https" href="https://anonymous.4open.science/r/ProtoLens-CE0B/" rel="external noopener nofollow">this https URL</a>}.
Computation and Language,Artificial Intelligence
What problem does this paper attempt to address?